
Setting Up Software Infrastructure on the Cloud

Packt
23 Sep 2014
42 min read
In this article by Roberto Freato, author of Microsoft Azure Development Cookbook, we mix some of the recipes of this book to build a complete overview of what we need to set up a software infrastructure on the cloud. (For more resources related to this topic, see here.)

Microsoft Azure is Microsoft's platform for cloud computing. It provides developers with elastic building blocks to build scalable applications. Those building blocks are services for web hosting, storage, computation, connectivity, and more, which are usable as stand-alone services or mixed together to build advanced scenarios. Building an application with Microsoft Azure really means choosing the appropriate services and mixing them together to run our application. We start by creating a SQL Database.

Creating a SQL Database server and database

SQL Database is a multitenant database system in which many distinct databases are hosted on many physical servers managed by Microsoft. SQL Database administrators have no control over the physical provisioning of a database to a particular physical server. Indeed, to maintain high availability, a primary and two secondary copies of each SQL Database are stored on separate physical servers, and users can't have any control over them. Consequently, SQL Database does not provide a way for the administrator to specify the physical layout of a database and its logs when creating a SQL Database. The administrator merely has to provide a name, maximum size, and service tier for the database.

A SQL Database server is the administrative and security boundary for a collection of SQL Databases hosted in a single Azure region. All connections to a database hosted by the server go through the service endpoint provided by the SQL Database server. At the time of writing this book, an Azure subscription can create up to six SQL Database servers, each of which can host up to 150 databases (including the master database). These are soft limits that can be increased by arrangement with Microsoft Support. From a billing perspective, only the database unit is charged, as the server unit is just a container. However, to avoid wasting unused resources, an empty server is automatically deleted after 90 days of not hosting any user databases.

The SQL Database server is provisioned on the Azure Portal. The Region, as well as the administrator login and password, must be specified during the provisioning process. After the SQL Database server has been provisioned, the firewall rules used to restrict access to the databases associated with the SQL Database server can be modified on the Azure Portal, using Transact-SQL, or through the SQL Database Service Management REST API. The result of the provisioning process is a SQL Database server identified by a fully qualified DNS name such as SERVER_NAME.database.windows.net, where SERVER_NAME is an automatically generated (random and unique) string that differentiates this SQL Database server from any other. The provisioning process also creates the master database for the SQL Database server and adds a user and associated login for the administrator specified during the provisioning process. This user has the rights to create other databases associated with this SQL Database server as well as any logins needed to access them.
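Later in this recipe, SQL Server Management Studio is used to connect to this endpoint; the same connection can also be verified from code. The following is a minimal, hedged ADO.NET sketch (the server name, login, and password are placeholders for the values generated and chosen during provisioning), shown only to illustrate the encrypted connection the recipe insists on:

```csharp
using System;
using System.Data.SqlClient;

class SqlDatabaseConnectionSketch
{
    static void Main()
    {
        var builder = new SqlConnectionStringBuilder
        {
            // SERVER_NAME is the random string assigned at provisioning time.
            DataSource = "tcp:SERVER_NAME.database.windows.net,1433",
            InitialCatalog = "master",      // the master database created with the server
            UserID = "adminLogin",          // some older tools expect the adminLogin@SERVER_NAME form
            Password = "{PASSWORD}",
            Encrypt = true,                 // always encrypt traffic to the public endpoint
            TrustServerCertificate = false
        };

        using (var connection = new SqlConnection(builder.ConnectionString))
        using (var command = new SqlCommand("SELECT @@VERSION;", connection))
        {
            connection.Open();              // fails unless a firewall rule allows this client IP
            Console.WriteLine(command.ExecuteScalar());
        }
    }
}
```

The client IP must be allowed by a server-level firewall rule, exactly as described for SSMS later in the recipe.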
Remember to distinguish between the SQL Database service and the familiar SQL Server engine, which is also available on the Azure platform, but as a plain installation inside VMs. In the latter case, you keep complete control of the instance that runs SQL Server, of the installation details, and of the effort needed to maintain it over time. Also, remember that SQL Server virtual machines have a different pricing from the standard VMs due to their license costs.

An administrator can create a SQL Database either on the Azure Portal or using the CREATE DATABASE Transact-SQL statement. At the time of writing this book, SQL Database runs in the following two different modes:
Version 1.0: This refers to the Web or Business Editions
Version 2.0: This refers to the Basic, Standard, or Premium service tiers with performance levels
The first version will be deprecated in a few months. Web Edition was designed for small databases under 5 GB and Business Edition for databases of 10 GB and larger (up to 150 GB). There is no difference between these editions other than the maximum size and billing increment. The second version introduced service tiers (the equivalent of editions) with an additional parameter (the performance level) that sets the amount of dedicated resources assigned to a given database. The new service tiers (Basic, Standard, and Premium) introduced a lot of advanced features such as active/passive geo-replication, point-in-time restore, cross-region copy, and restore. Different performance levels have different limits, such as the Database Throughput Units (DTUs) and the maximum DB size. An updated list of service tiers and performance levels can be found at http://msdn.microsoft.com/en-us/library/dn741336.aspx. Once a SQL Database has been created, the ALTER DATABASE Transact-SQL statement can be used to alter either the edition or the maximum size of the database. The maximum size is important, as the database is made read-only once it reaches that size (with the "The database has reached its size quota" error message, error number 40544). In this recipe, we'll learn how to create a SQL Database server and a database using the Azure Portal and T-SQL.

Getting ready

To perform the majority of the operations in this recipe, just a plain internet browser is needed. However, to connect directly to the server, we will use SQL Server Management Studio (also available in the Express version).

How to do it...

First, we are going to create a SQL Database server using the Azure Portal. We will do this using the following steps: On the Azure Portal, go to the SQL DATABASES section and then select the SERVERS tab. In the bottom menu, select Add. In the CREATE SERVER window, provide an administrator login and password. Select a Subscription and Region that will host the server. To enable access to the server from the other Windows Azure services, you can check the Allow Windows Azure Services to access the server checkbox; this is a special firewall rule that allows the 0.0.0.0 to 0.0.0.0 IP range. Confirm and wait a few seconds for the operation to complete. After that, using the Azure Portal, go to the SQL DATABASES section and then the SERVERS tab. Select the previously created server by clicking on its name. On the server page, go to the DATABASES tab. In the bottom menu, click on Add; then, after clicking on NEW SQL DATABASE, the CUSTOM CREATE window will open. Specify a name and select the Web Edition. Set the maximum database size to 5 GB and leave the COLLATION dropdown at its default. SQL Database fees are charged differently if you are using the Web/Business Edition rather than the Basic/Standard/Premium service tiers.
The most updated pricing scheme for SQL Database can be found at http://azure.microsoft.com/en-us/pricing/details/sql-database/ Verify the server on which you are creating the database (it is specified correctly in the SERVER dropdown) and confirm it. Alternatively, using Transact SQL, launch Microsoft SQL Server Management Studio and open the Connect to Server window. In the Server name field, specify the fully qualified name of the newly created SQL Database server in the following form: serverName.database.windows.net. Choose the SQL Server Authentication method. Specify the administrative username and password associated earlier. Click on the Options button and specify the Encrypt connection checkbox. This setting is particularly critical while accessing a remote SQL Database. Without encryption, a malicious user could extract all the information to log in to the database himself, from the network traffic. Specifying the Encrypt connection flag, we are telling the client to connect only if a valid certificate is found on the server side. Optionally check the Remember password checkbox and connect to the server. To connect remotely to the server, a firewall rule should be created. In the Object Explorer window, locate the server you connected to, navigate to Databases | System Databases folder, and then right-click on the master database and select New Query. 18. Copy and execute this query and wait for its completion:. CREATE DATABASE DATABASE_NAME ( MAXSIZE = 1 GB ) How it works... The first part is pretty straightforward. In steps 1 and 2, we go to the SQL Database section of the Azure portal, locating the tab to manage the servers. In step 3, we fill the online popup with the administrative login details, and in step 4, we select a Region to place the SQL Database server. As a server (with its database) is located in a Region, it is not possible to automatically migrate it to another Region. After the creation of the container resource (the server), we create the SQL Database by adding a new database to the newly created server, as stated from steps 6 to 9. In step 10, we can optionally change the default collation of the database and its maximum size. In the last part, we use the SQL Server Management Studio (SSMS) (step 12) to connect to the remote SQL Database instance. We notice that even without a database, there is a default database (the master one) we can connect to. After we set up the parameters in step 13, 14, and 15, we enable the encryption requirement for the connection. Remember to always set the encryption before connecting or listing the databases of a remote endpoint, as every single operation without encryption consists of plain credentials sent over the network. In step 17, we connect to the server if it grants access to our IP. Finally, in step 18, we open a contextual query window, and in step 19, we execute the creation query, specifying a maximum size for the database. Note that the Database Edition should be specified in the CREATE DATABASE query as well. By default, the Web Edition is used. To override this, the following query can be used: CREATE DATABASE MyDB ( Edition='Basic' ) There's more… We can also use the web-based Management Portal to perform various operations against the SQL Database, such as invoking Transact SQL commands, altering tables, viewing occupancy, and monitoring the performance. We will launch the Management Portal using the following steps: Obtain the name of the SQL Database server that contains the SQL Database. 
Go to https://serverName.database.windows.net. In the Database fields, enter the database name (leave it empty to connect to the master database). Fill the Username and Password fields with the login information and confirm. Increasing the size of a database We can use the ALTER DATABASE command to increase the size (or the Edition, with the Edition parameter) of a SQL Database by connecting to the master database and invoking the following Transact SQL command: ALTER DATABASE DATABASE_NAME MODIFY ( MAXSIZE = 5 GB ) We must use one of the allowable database sizes. Connecting to a SQL Database with Entity Framework The Azure SQL Database is a SQL Server-like fully managed relation database engine. In many other recipes, we showed you how to connect transparently to the SQL Database, as we did in the SQL Server, as the SQL Database has the same TDS protocol as its on-premise brethren. In addition, using the raw ADO.NET could lead to some of the following issues: Hardcoded SQL: In spite of the fact that a developer should always write good code and make no errors, there is the finite possibility to make mistake while writing stringified SQL, which will not be verified at design time and might lead to runtime issues. These kind of errors lead to runtime errors, as everything that stays in the quotation marks compiles. The solution is to reduce every line of code to a command that is compile time safe. Type safety: As ADO.NET components were designed to provide a common layer of abstraction to developers who connect against several different data sources, the interfaces provided are generic for the retrieval of values from the fields of a data row. A developer could make a mistake by casting a field to the wrong data type, and they will realize it only at run time. The solution is to reduce the mapping of table fields to the correct data type at compile time. Long repetitive actions: We can always write our own wrapper to reduce the code replication in the application, but using a high-level library, such as the ORM, can take off most of the repetitive work to open a connection, read data, and so on. Entity Framework hides the complexity of the data access layer and provides developers with an intermediate abstraction layer to let them operate on a collection of objects instead of rows of tables. The power of the ORM itself is enhanced by the usage of LINQ, a library of extension methods that, in synergy with the language capabilities (anonymous types, expression trees, lambda expressions, and so on), makes the DB access easier and less error prone than in the past. This recipe is an introduction to Entity Framework, the ORM of Microsoft, in conjunction with the Azure SQL Database. Getting Ready The database used in this recipe is the Northwind sample database of Microsoft. It can be downloaded from CodePlex at http://northwinddatabase.codeplex.com/. How to do it… We are going to connect to the SQL Database using Entity Framework and perform various operations on data. We will do this using the following steps: Add a new class named EFConnectionExample to the project. Add a new ADO.NET Entity Data Model named Northwind.edmx to the project; the Entity Data Model Wizard window will open. Choose Generate from database in the Choose Model Contents step. In the Choose Your Data Connection step, select the Northwind connection from the dropdown or create a new connection if it is not shown. Save the connection settings in the App.config file for later use and name the setting NorthwindEntities. 
If VS asks for the version of EF to use, select the most recent one. In the last step, choose the object to include in the model. Select the Tables, Views, Stored Procedures, and Functions checkboxes. Add the following method, retrieving every CompanyName, to the class: private IEnumerable<string> NamesOfCustomerCompanies() { using (var ctx = new NorthwindEntities()) { return ctx.Customers .Select(p => p.CompanyName).ToArray(); } } Add the following method, updating every customer located in Italy, to the class: private void UpdateItalians() { using (var ctx = new NorthwindEntities()) { ctx.Customers.Where(p => p.Country == "Italy") .ToList().ForEach(p => p.City = "Milan"); ctx.SaveChanges(); } } Add the following method, inserting a new order for the first Italian company alphabetically, to the class: private int FirstItalianPlaceOrder() { using (var ctx = new NorthwindEntities()) { var order = new Orders() { EmployeeID = 1, OrderDate = DateTime.UtcNow, ShipAddress = "My Address", ShipCity = "Milan", ShipCountry = "Italy", ShipName = "Good Ship", ShipPostalCode = "20100" }; ctx.Customers.Where(p => p.Country == "Italy") .OrderBy(p=>p.CompanyName) .First().Orders.Add(order); ctx.SaveChanges(); return order.OrderID; } } Add the following method, removing the previously inserted order, to the class: private void RemoveTheFunnyOrder(int orderId) { using (var ctx = new NorthwindEntities()) { var order = ctx.Orders .FirstOrDefault(p => p.OrderID == orderId); if (order != null) ctx.Orders.Remove(order); ctx.SaveChanges(); } } Add the following method, using the methods added earlier, to the class: public static void UseEFConnectionExample() { var example = new EFConnectionExample(); var customers=example.NamesOfCustomerCompanies(); foreach (var customer in customers) { Console.WriteLine(customer); } example.UpdateItalians(); var order=example.FirstItalianPlaceOrder(); example.RemoveTheFunnyOrder(order); } How it works… This recipe uses EF to connect and operate on a SQL Database. In step 1, we create a class that contains the recipe, and in step 2, we open the wizard for the creation of Entity Data Model (EDMX). We create the model, starting from an existing database in step 3 (it is also possible to write our own model and then persist it in an empty database), and then, we select the connection in step 4. In fact, there is no reference in the entire code to the Windows Azure SQL Database. The only reference should be in the App.config settings created in step 5; this can be changed to point to a SQL Server instance, leaving the code untouched. The last step of the EDMX creation consists of concrete mapping between the relational table and the object model, as shown in step 6. This method generates the code classes that map the table schema, using strong types and collections referred to as Navigation properties. It is also possible to start from the code, writing the classes that could represent the database schema. This method is known as Code-First. In step 7, we ask for every CompanyName of the Customers table. Every table in EF is represented by DbSet<Type>, where Type is the class of the entity. In steps 7 and 8, Customers is DbSet<Customers>, and we use a lambda expression to project (select) a property field and another one to create a filter (where) based on a property value. The SaveChanges method in step 8 persists to the database the changes detected in the disconnected object data model. This magic is one of the purposes of an ORM tool. 
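Because the App.config connection string is the only piece that ties this model to Windows Azure SQL Database, a practical addition worth mentioning here (it is not part of the recipe, and it assumes EF 6 or later was selected in the wizard) is the built-in connection resiliency policy, which retries transient SQL Database faults instead of failing immediately. A minimal sketch:

```csharp
using System;
using System.Data.Entity;
using System.Data.Entity.SqlServer;

// EF picks this class up automatically because it derives from DbConfiguration
// and lives in the same assembly as the generated NorthwindEntities context.
public class NorthwindEfConfiguration : DbConfiguration
{
    public NorthwindEfConfiguration()
    {
        // Retry throttling and failover errors with an exponential back-off:
        // at most 4 retries, never waiting more than 5 seconds between attempts.
        SetExecutionStrategy("System.Data.SqlClient",
            () => new SqlAzureExecutionStrategy(4, TimeSpan.FromSeconds(5)));
    }
}
```

With that aside in place, the rest of the recipe behaves exactly as described; the walkthrough now continues with the insert and delete operations.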
In step 9, we use the navigation property (relationship) between a Customers object and the Orders collection (table) to add a new order with sample data. We use the OrderBy extension method to order the results by the specified property, and finally, we save the newly created item. Even now, EF automatically keeps track of the newly added item. Additionally, after the SaveChanges method, EF populates the identity field of Order (OrderID) with the actual value created by the database engine. In step 10, we use the previously obtained OrderID to remove the corresponding order from the database. We use the FirstOrDefault() method to test the existence of the ID, and then, we remove the resulting object like we removed an object from a plain old collection. In step 11, we use the methods created to run the demo and show the results. Deploying a Website Creating a Website is an administrative task, which is performed in the Azure Portal in the same way we provision every other building block. The Website created is like a "deployment slot", or better, "web space", since the abstraction given to the user is exactly that. Azure Websites does not require additional knowledge compared to an old-school hosting provider, where FTP was the standard for the deployment process. Actually, FTP is just one of the supported deployment methods in Websites, since Web Deploy is probably the best choice for several scenarios. Web Deploy is a Microsoft technology used for copying files and provisioning additional content and configuration to integrate the deployment process. Web Deploy runs on HTTP and HTTPS with basic (username and password) authentication. This makes it a good choice in networks where FTP is forbidden or the firewall rules are strict. Some time ago, Microsoft introduced the concept of Publish Profile, an XML file containing all the available deployment endpoints of a particular website that, if given to Visual Studio or Web Matrix, could make the deployment easier. Every Azure Website comes with a publish profile with unique credentials, so one can distribute it to developers without giving them grants on the Azure Subscription. Web Matrix is a client tool of Microsoft, and it is useful to edit live sites directly from an intuitive GUI. It uses Web Deploy to provide access to the remote filesystem as to perform remote changes. In Websites, we can host several websites on the same server farm, making administration easier and isolating the environment from the neighborhood. Moreover, virtual directories can be defined from the Azure Portal, enabling complex scenarios or making migrations easier. In this recipe, we will cope with the deployment process, using FTP and Web Deploy with some variants. Getting ready This recipe assumes we have and FTP client installed on the local machine (for example, FileZilla) and, of course, a valid Azure Subscription. We also need Visual Studio 2013 with the latest Azure SDK installed (at the time of writing, SDK Version 2.3). How to do it… We are going to create a new Website, create a new ASP.NET project, deploy it through FTP and Web Deploy, and also use virtual directories. We do this as follows: Create a new Website in the Azure Portal, specifying the following details: The URL prefix (that is, TestWebSite) is set to [prefix].azurewebsites.net The Web Hosting Plan (create a new one) The Region/Location (select West Europe) Click on the newly created Website and go to the Dashboard tab. 
Click on Download the publish profile and save it on the local computer. Open Visual Studio and create a new ASP.NET web application named TestWebSite, with an empty template and web forms' references. Add a sample Default.aspx page to the project and paste into it the following HTML: <h1>Root Application</h1> Press F5 and test whether the web application is displayed correctly. Create a local publish target. Right-click on the project and select Publish. Select Custom and specify Local Folder. In the Publish method, select File System and provide a local folder where Visual Studio will save files. Then click on Publish to complete. Publish via FTP. Open FileZilla and then open the Publish profile (saved in step 3) with a text editor. Locate the FTP endpoint and specify the following: publishUrl as the Host field username as the Username field userPWD as the Password field Delete the hostingstart.html file that is already present on the remote space. When we create a new Azure Website, there is a single HTML file in the root folder by default, which is served to the clients as the default page. By leaving it in the Website, the file could be served after users' deployments as well if no valid default documents are found. Drag-and-drop all the contents of the local folder with the binaries to the remote folder, then run the website. Publish via Web Deploy. Right-click on the Project and select Publish. Go to the Publish Web wizard start and select Import, providing the previously downloaded Publish Profile file. When Visual Studio reads the Web Deploy settings, it populates the next window. Click on Confirm and Publish the web application. Create an additional virtual directory. Go to the Configure tab of the Website on the Azure Portal. At the bottom, in the virtual applications and directories, add the following: /app01 with the path siteapp01 Mark it as Application Open the Publish Profile file and duplicate the <publishProfile> tag with the method FTP, then edit the following: Add the suffix App01 to profileName Replace wwwroot with app01 in publishUrl Create a new ASP.NET web application called TestWebSiteApp01 and create a new Default.aspx page in it with the following code: <h1>App01 Application</h1> Right-click on the TestWebSiteApp01 project and Publish. Select Import and provide the edited Publish Profile file. In the first step of the Publish Web wizard (go back if necessary), select the App01 method and select Publish. Run the Website's virtual application by appending the /app01 suffix to the site URL. How it works... In step 1, we create the Website on the Azure Portal, specifying the minimal set of parameters. If the existing web hosting plan is selected, the Website will start in the specified tier. In the recipe, by specifying a new web hosting plan, the Website is created in the free tier with some limitations in configuration. The recipe uses the Azure Portal located at https://manage.windowsazure.com. However, the new Azure Portal will be at https://portal.azure.com. New features will be probably added only in the new Portal. In steps 2 and 3, we download the Publish Profile file, which is an XML containing the various endpoints to publish the Website. At the time of writing, Web Deploy and FTP are supported by default. In steps 4, 5, and 6, we create a new ASP.NET web application with a sample ASPX page and run it locally. In steps 7, 8, and 9, we publish the binaries of the Website, without source code files, into a local folder somewhere in the local machine. 
This unit of deployment (the folder) can be sent across the wire via FTP, as we do in steps 10 to 13 using the credentials and the hostname available in the Publish Profile file. In steps 14 to 16, we use the Publish Profile file directly from Visual Studio, which recognizes the different methods of deployment and suggests Web Deploy as the default one. If we have already performed steps 10-13, steps 14-16 overwrite the existing deployment. Actually, Web Deploy compares the target files with the ones to deploy, making the deployment incremental for those files that have been modified or added. This is extremely useful to avoid unnecessary transfers and to save bandwidth. In steps 17 and 18, we configure a new Virtual Application, specifying its name and location. We can use an FTP client to browse the root folder of a website endpoint, since there are several folders such as wwwroot, locks, diagnostics, and deployments. In step 19, we manually edit the Publish Profile file to support a second FTP endpoint, pointing to the new folder of the Virtual Application. Visual Studio will correctly understand this while parsing the file again in step 22, showing the new deployment option. Finally, we verify that there are two applications: one on the root folder / and one on the /app01 alias.

There's more…

Suppose we need to edit the website on the fly, editing a CSS or JS file or changing the HTML somewhere. We can do this using Web Matrix, which is available from the Azure Portal itself through a ClickOnce installation: Go to the Dashboard tab of the Website and click on WebMatrix at the bottom. Follow the instructions to install the software (if not yet installed) and, when it opens, select Edit live site directly (the magic is done through the Publish Profile file and Web Deploy). In the left-side tree, edit the Default.aspx file, and then save and run the Website again.

Azure Websites gallery

Since Azure Websites is a PaaS service, with no lock-in or particular knowledge or framework required to run it, it can host several open source CMSes in different languages. Azure provides a set of built-in web applications to choose from while creating a new website. This is probably not the best choice for production environments; however, for testing or development purposes, it should be a faster option than starting from scratch. Wizards have been, for a while, the primary resources for developers to quickly start off projects and speed up the process of creating complex environments. However, the Websites gallery creates instances of well-known CMSes with predefined configurations. Instead, production environments are manually crafted, customizing each aspect of the installation. To create a new Website using the gallery, proceed as follows: Create a new Website, specifying From Gallery. Select the web application to deploy and follow the optional configuration steps. If we create some resources (like databases) while using the gallery, they will be linked to the site in the Linked Resources tab.
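Before moving on to caching, note that the FTP endpoint used in steps 10 to 13 is not tied to FileZilla; the same values from the Publish Profile file can drive a scripted upload. The following hedged C# sketch assumes placeholder values copied from the profile (publishUrl, userName, and userPWD) and the conventional /site/wwwroot content root; Web Deploy remains preferable for real deployments because of the incremental behavior described earlier:

```csharp
using System;
using System.IO;
using System.Net;

class FtpDeploySketch
{
    static void Main()
    {
        // Values taken from the <publishProfile publishMethod="FTP"> element of the profile.
        var target = "ftp://{publishUrl-host}/site/wwwroot/Default.aspx";
        var request = (FtpWebRequest)WebRequest.Create(target);
        request.Method = WebRequestMethods.Ftp.UploadFile;
        request.Credentials = new NetworkCredential(@"{userName}", "{userPWD}");
        request.EnableSsl = true;                   // prefer FTPS when the endpoint allows it

        using (var local = File.OpenRead(@"publish\Default.aspx"))  // file from the local publish folder
        using (var remote = request.GetRequestStream())
        {
            local.CopyTo(remote);                   // stream the file into the remote wwwroot
        }

        using (var response = (FtpWebResponse)request.GetResponse())
        {
            Console.WriteLine("Upload completed: {0}", response.StatusDescription);
        }
    }
}
```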
Building a simple cache for applications Azure Cache is a managed service with (at the time of writing this book) the following three offerings: Basic: This service has a unit size of 128 MB, up to 1 GB with one named cache (the default one) Standard: This service has a unit size of 1 GB, up to 10 GB with 10 named caches and support for notifications Premium: This service has a unit size of 5 GB, up to 150 GB with ten named caches, support for notifications, and high availability Different offerings have different unit prices, and remember that when changing from one offering to another, all the cache data is lost. In all offerings, users can define the items' expiration. The Cache service listens to a specific TCP port. Accessing it from a .NET application is quite simple, with the Microsoft ApplicationServer Caching library available on NuGet. In the Microsoft.ApplicationServer.Caching namespace, the following are all the classes that are needed to operate: DataCacheFactory: This class is responsible for instantiating the Cache proxies to interpret the configuration settings. DataCache: This class is responsible for the read/write operation against the cache endpoint. DataCacheFactoryConfiguration: This is the model class of the configuration settings of a cache factory. Its usage is optional as cache can be configured in the App/Web.config file in a specific configuration section. Azure Cache is a key-value cache. We can insert and even get complex objects with arbitrary tree depth using string keys to locate them. The importance of the key is critical, as in a single named cache, only one object can exist for a given key. The architects and developers should have the proper strategy in place to deal with unique (and hierarchical) names. Getting ready This recipe assumes that we have a valid Azure Cache endpoint of the standard type. We need the standard type because we use multiple named caches, and in later recipes, we use notifications. We can create a Standard Cache endpoint of 1 GB via PowerShell. Perform the following steps to create the Standard Cache endpoint : Open the Azure PowerShell and type Add-AzureAccount. A popup window might appear. Type your credentials connected to a valid Azure subscription and continue. Optionally, select the proper Subscription, if not the default one. Type this command to create a new Cache endpoint, replacing myCache with the proper unique name: New-AzureManagedCache -Name myCache -Location "West Europe" -Sku Standard -Memory 1GB After waiting for some minutes until the endpoint is ready, go to the Azure Portal and look for the Manage Keys section to get one of the two Access Keys of the Cache endpoint. In the Configure section of the Cache endpoint, a cache named default is created by default. In addition, create two named caches with the following parameters: Expiry Policy: Absolute Time: 10 Notifications: Enabled Expiry Policy could be Absolute (the default expiration time or the one set by the user is absolute, regardless of how many times the item has been accessed), Sliding (each time the item has been accessed, the expiration timer resets), or Never (items do not expire). This Azure Cache endpoint is now available in the Management Portal, and it will be used in the entire article. How to do it… We are going to create a DataCache instance through a code-based configuration. We will perform simple operations with Add, Get, Put, and Append/Prepend, using a secondary-named cache to transfer all the contents of the primary one. 
We will do this by performing the following steps: Add a new class named BuildingSimpleCacheExample to the project. Install the Microsoft.WindowsAzure.Caching NuGet package. Add the following using statement to the top of the class file: using Microsoft.ApplicationServer.Caching; Add the following private members to the class: private DataCacheFactory factory = null; private DataCache cache = null; Add the following constructor to the class: public BuildingSimpleCacheExample(string ep, string token,string cacheName) { DataCacheFactoryConfiguration config = new DataCacheFactoryConfiguration(); config.AutoDiscoverProperty = new DataCacheAutoDiscoverProperty(true, ep); config.SecurityProperties = new DataCacheSecurity(token, true); factory = new DataCacheFactory(config); cache = factory.GetCache(cacheName); } Add the following method, creating a palindrome string into the cache: public void CreatePalindromeInCache() { var objKey = "StringArray"; cache.Put(objKey, ""); char letter = 'A'; for (int i = 0; i < 10; i++) { cache.Append(objKey, char.ConvertFromUtf32((letter+i))); cache.Prepend(objKey, char.ConvertFromUtf32((letter + i))); } Console.WriteLine(cache.Get(objKey)); } Add the following method, adding an item into the cache to analyze its subsequent retrievals: public void AddAndAnalyze() { var randomKey = DateTime.Now.Ticks.ToString(); var value="Cached string"; cache.Add(randomKey, value); DataCacheItem cacheItem = cache.GetCacheItem(randomKey); Console.WriteLine(string.Format( "Item stored in {0} region with {1} expiration", cacheItem.RegionName,cacheItem.Timeout)); cache.Put(randomKey, value, TimeSpan.FromSeconds(60)); cacheItem = cache.GetCacheItem(randomKey); Console.WriteLine(string.Format( "Item stored in {0} region with {1} expiration", cacheItem.RegionName, cacheItem.Timeout)); var version = cacheItem.Version; var obj = cache.GetIfNewer(randomKey, ref version); if (obj == null) { //No updates } } Add the following method, transferring the contents of the cache named initially into a second one: public void BackupToDestination(string destCacheName) { var destCache = factory.GetCache(destCacheName); var dump = cache.GetSystemRegions() .SelectMany(p => cache.GetObjectsInRegion(p)) .ToDictionary(p=>p.Key,p=>p.Value); foreach (var item in dump) { destCache.Put(item.Key, item.Value); } } Add the following method to clear the cache named first: public void ClearCache() { cache.Clear(); } Add the following method, using the methods added earlier, to the class: public static void RunExample() { var cacheName = "[named cache 1]"; var backupCache = "[named cache 2]"; string endpoint = "[cache endpoint]"; string token = "[cache token/key]"; BuildingSimpleCacheExample example = new BuildingSimpleCacheExample(endpoint, token, cacheName); example.CreatePalindromeInCache(); example.AddAndAnalyze(); example.BackupToDestination(backupCache); example.ClearCache(); } How it works... From steps 1 to 3, we set up the class. In step 4, we add private members to store the DataCacheFactory object used to create the DataCache object to access the Cache service. In the constructor that we add in step 5, we initialize the DataCacheFactory object using a configuration model class (DataCacheFactoryConfiguration). This strategy is for code-based initialization whenever settings cannot stay in the App.config/Web.config file. In step 6, we use the Put() method to write an empty string into the StringArray bucket. 
We then use the Append() and Prepend() methods, designed to concatenate strings to existing strings, to build a palindrome string in the memory cache. This sample does not make any sense in real-world scenarios, and we must pay attention to some of the following issues: Writing an empty string into the cache is somehow useless. Each Append() or Prepend() operation travels on TCP to the cache and goes back. Though it is very simple, it requires resources, and we should always try to consolidate calls. In step 7, we use the Add() method to add a string to the cache. The difference between the Add() and Put() methods is that the first method throws an exception if the item already exists, while the second one always overwrites the existing value (or writes it for the first time). GetCacheItem() returns a DataCacheItem object, which wraps the value together with other metadata properties, such as the following: CacheName: This is the named cache where the object is stored. Key: This is the key of the associated bucket. RegionName (user defined or system defined): This is the region of the cache where the object is stored. Size: This is the size of the object stored. Tags: These are the optional tags of the object, if it is located in a user-defined region. Timeout: This is the current timeout before the object would expire. Version: This is the version of the object. This is a DataCacheItemVersion object whose properties are not accessible due to their modifier. However, it is not important to access this property, as the Version object is used as a token against the Cache service to implement the optimistic concurrency. As for the timestamp value, its semantic can stay hidden from developers. The first Add() method does not specify a timeout for the object, leaving the default global expiration timeout, while the next Put() method does, as we can check in the next Get() method. We finally ask the cache about the object with the GetIfNewer() method, passing the latest version token we have. This conditional Get method returns null if the object we own is already the latest one. In step 8, we list all the keys of the first named cache, using the GetSystemRegions() method (to first list the system-defined regions), and for each region, we ask for their objects, copying them into the second named cache. In step 9, we clear all the contents of the first cache. In step 10, we call the methods added earlier, specifying the Cache endpoint to connect to and the token/password, along with the two named caches in use. Replace [named cache 1], [named cache 2], [cache endpoint], and [cache token/key] with actual values. There's more… Code-based configuration is useful when the settings stay in a different place as compared to the default config files for .NET. 
It is not a best practice to hardcode them, so this is the standard way to declare them in the App.config file: <configSections> <section name="dataCacheClients" type="Microsoft.ApplicationServer.Caching.DataCacheClientsSection, Microsoft.ApplicationServer.Caching.Core" allowLocation="true" allowDefinition="Everywhere" /> </configSections> The XML mentioned earlier declares a custom section, which should be as follows: <dataCacheClients> <dataCacheClient name="[name of cache]"> <autoDiscover isEnabled="true" identifier="[domain of cache]" /> <securityProperties mode="Message" sslEnabled="true"> <messageSecurity authorizationInfo="[token of endpoint]" /> </securityProperties> </dataCacheClient> </dataCacheClients> In the upcoming recipes, we will use this convention to set up the DataCache objects. ASP.NET Support With almost no effort, the Azure Cache can be used as Output Cache in ASP.NET to save the session state. To enable this, in addition to the configuration mentioned earlier, we need to include those declarations in the <system.web> section as follows: <sessionState mode="Custom" customProvider="AFCacheSessionStateProvider"> <providers> <add name="AFCacheSessionStateProvider" type="Microsoft.Web.DistributedCache.DistributedCacheSessionStateStoreProvider, Microsoft.Web.DistributedCache" cacheName="[named cache]" dataCacheClientName="[name of cache]" applicationName="AFCacheSessionState"/> </providers> </sessionState> <caching> <outputCache defaultProvider="AFCacheOutputCacheProvider"> <providers> <add name="AFCacheOutputCacheProvider" type="Microsoft.Web.DistributedCache.DistributedCacheOutputCacheProvider, Microsoft.Web.DistributedCache" cacheName="[named cache]" dataCacheClientName="[name of cache]" applicationName="AFCacheOutputCache" /> </providers> </outputCache> </caching> The difference between [name of cache] and [named cache] is as follows: The [name of cache] part is a friendly name of the cache client declared above an alias. The [named cache] part is the named cache created into the Azure Cache service. Connecting to the Azure Storage service In an Azure Cloud Service, the storage account name and access key are stored in the service configuration file. By convention, the account name and access key for data access are provided in a setting named DataConnectionString. The account name and access key needed for Azure Diagnostics must be provided in a setting named Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString. The DataConnectionString setting must be declared in the ConfigurationSettings section of the service definition file. However, unlike other settings, the connection string setting for Azure Diagnostics is implicitly defined when the Diagnostics module is specified in the Imports section of the service definition file. Consequently, it must not be specified in the ConfigurationSettings section. A best practice is to use different storage accounts for application data and diagnostic data. This reduces the possibility of application data access being throttled by competition for concurrent writes from the diagnostics monitor. What is Throttling? In shared services, where the same resources are shared between tenants, limiting the concurrent access to them is critical to provide service availability. If a client misuses the service or, better, generates a huge amount of traffic, other tenants pointing to the same shared resource could experience unavailability. 
Throttling (also known as Traffic Control plus Request Cutting) is one of the most adopted solutions that is solving this issue. It also provides a security boundary between application data and diagnostics data, as diagnostics data might be accessed by individuals who should have no access to application data. In the Azure Storage library, access to the storage service is through one of the Client classes. There is one Client class for each Blob service, Queue service, and Table service; they are CloudBlobClient, CloudQueueClient, and CloudTableClient, respectively. Instances of these classes store the pertinent endpoint as well as the account name and access key. The CloudBlobClient class provides methods to access containers, list their contents, and get references to containers and blobs. The CloudQueueClient class provides methods to list queues and get a reference to the CloudQueue instance that is used as an entry point to the Queue service functionality. The CloudTableClient class provides methods to manage tables and get the TableServiceContext instance that is used to access the WCF Data Services functionality while accessing the Table service. Note that the CloudBlobClient, CloudQueueClient, and CloudTableClient instances are not thread safe, so distinct instances should be used when accessing these services concurrently. The client classes must be initialized with the account name, access key, as well as the appropriate storage service endpoint. The Microsoft.WindowsAzure namespace has several helper classes. The StorageCredential class initializes an instance from an account name and access key or from a shared access signature. In this recipe, we'll learn how to use the CloudBlobClient, CloudQueueClient, and CloudTableClient instances to connect to the storage service. Getting ready This recipe assumes that the application's configuration file contains the following: <appSettings> <add key="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName={ACCOUNT_NAME};AccountKey={ACCOUNT_KEY}"/> <add key="AccountName" value="{ACCOUNT_NAME}"/> <add key="AccountKey" value="{ACCOUNT_KEY}"/> </appSettings> We must replace {ACCOUNT_NAME} and {ACCOUNT_KEY} with appropriate values for the storage account name and access key, respectively. We are not working in a Cloud Service but in a simple console application. Storage services, like many other building blocks of Azure, can also be used separately from on-premise environments. How to do it... We are going to connect to the Table service, the Blob service, and the Queue service, and perform a simple operation on each. We will do this using the following steps: Add a new class named ConnectingToStorageExample to the project. Add the following using statements to the top of the class file: using Microsoft.WindowsAzure.Storage; using Microsoft.WindowsAzure.Storage.Blob; using Microsoft.WindowsAzure.Storage.Queue; using Microsoft.WindowsAzure.Storage.Table; using Microsoft.WindowsAzure.Storage.Auth; using System.Configuration; The System.Configuration assembly should be added via the Add Reference action onto the project, as it is not included in most of the project templates of Visual Studio. 
Add the following method, connecting the blob service, to the class: private static void UseCloudStorageAccountExtensions() { CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse( ConfigurationManager.AppSettings[ "DataConnectionString"]); CloudBlobClient cloudBlobClient = cloudStorageAccount.CreateCloudBlobClient(); CloudBlobContainer cloudBlobContainer = cloudBlobClient.GetContainerReference( "{NAME}"); cloudBlobContainer.CreateIfNotExists(); } Add the following method, connecting the Table service, to the class: private static void UseCredentials() { string accountName = ConfigurationManager.AppSettings[ "AccountName"]; string accountKey = ConfigurationManager.AppSettings[ "AccountKey"]; StorageCredentials storageCredentials = new StorageCredentials( accountName, accountKey); CloudStorageAccount cloudStorageAccount = new CloudStorageAccount(storageCredentials, true); CloudTableClient tableClient = new CloudTableClient( cloudStorageAccount.TableEndpoint, storageCredentials); CloudTable table = tableClient.GetTableReference("{NAME}"); table.CreateIfNotExists(); } Add the following method, connecting the Queue service, to the class: private static void UseCredentialsWithUri() { string accountName = ConfigurationManager.AppSettings[ "AccountName"]; string accountKey = ConfigurationManager.AppSettings[ "AccountKey"]; StorageCredentials storageCredentials = new StorageCredentials( accountName, accountKey); StorageUri baseUri = new StorageUri(new Uri(string.Format( "https://{0}.queue.core.windows.net/", accountName))); CloudQueueClient cloudQueueClient = new CloudQueueClient(baseUri, storageCredentials); CloudQueue cloudQueue = cloudQueueClient.GetQueueReference("{NAME}"); cloudQueue.CreateIfNotExists(); } Add the following method, using the other methods, to the class: public static void UseConnectionToStorageExample() { UseCloudStorageAccountExtensions(); UseCredentials(); UseCredentialsWithUri(); } How it works... In steps 1 and 2, we set up the class. In step 3, we implement the standard way to access the storage service using the Storage Client library. We use the static CloudStorageAccount.Parse() method to create a CloudStorageAccount instance from the value of the connection string stored in the configuration file. We then use this instance with the CreateCloudBlobClient() extension method of the CloudStorageAccount class to get the CloudBlobClient instance that we use to connect to the Blob service. We can also use this technique with the Table service and the Queue service, using the relevant extension methods, CreateCloudTableClient() and CreateCloudQueueClient(), respectively, for them. We complete this example using the CloudBlobClient instance to get a CloudBlobContainer reference to a container and then create it if it does not exist We need to replace {NAME} with the name for a container. In step 4, we create a StorageCredentials instance directly from the account name and access key. We then use this to construct a CloudStorageAccount instance, specifying that any connection should use HTTPS. Using this technique, we need to provide the Table service endpoint explicitly when creating the CloudTableClient instance. We then use this to create the table. We need to replace {NAME} with the name of a table. We can use the same technique with the Blob service and Queue service using the relevant CloudBlobClient or CloudQueueClient constructor. 
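As noted earlier, a StorageCredentials instance can also be built from a shared access signature rather than the account key. This is outside the recipe, but the following hedged sketch (the container URI and SAS token are placeholders) shows how a client that never sees the account key can still list a single container:

```csharp
using System;
using Microsoft.WindowsAzure.Storage.Auth;
using Microsoft.WindowsAzure.Storage.Blob;

class SasAccessSketch
{
    static void ListWithSas()
    {
        // A SAS token generated elsewhere by the account owner; it grants
        // limited, time-boxed rights without exposing the account key.
        var sasCredentials = new StorageCredentials(
            "?sv=2014-02-14&sr=c&sp=rl&se={EXPIRY}&sig={SIGNATURE}");

        var container = new CloudBlobContainer(
            new Uri("https://{ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER}"),
            sasCredentials);

        foreach (IListBlobItem item in container.ListBlobs())
        {
            Console.WriteLine(item.Uri);
        }
    }
}
```

Returning to the recipe, the remaining steps cover the explicit-endpoint technique.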
In step 5, we use a similar technique, except that we avoid the intermediate step of using a CloudStorageAccount instance and explicitly provide the endpoint for the Queue service. We use the CloudQueueClient instance created in this step to create the queue. We need to replace {NAME} with the name of a queue. Note that we hardcoded the endpoint for the Queue service. Though this last method is officially supported, it is not a best practice to bind our code to hardcoded strings with endpoint URIs. So, it is preferable to use one of the previous methods that hides the complexity of the URI generation at the library level. In step 6, we add a method that invokes the methods added in the earlier steps. There's more… With the general availability of the .NET Framework Version 4.5, many libraries of the CLR have been added with the support of asynchronous methods with the Async/Await pattern. Latest versions of the Azure Storage Library also have these overloads, which are useful while developing mobile applications, and fast web APIs. They are generally useful when it is needed to combine the task execution model into our applications. Almost each long-running method of the library has its corresponding methodAsync() method to be called as follows: await cloudQueue.CreateIfNotExistsAsync(); In the rest of the book, we will continue to use the standard, synchronous pattern. Adding messages to a Storage queue The CloudQueue class in the Azure Storage library provides both synchronous and asynchronous methods to add a message to a queue. A message comprises up to 64 KB bytes of data (48 KB if encoded in Base64). By default, the Storage library Base64 encodes message content to ensure that the request payload containing the message is valid XML. This encoding adds overhead that reduces the actual maximum size of a message. A message for a queue should not be intended to transport a big payload, since the purpose of a Queue is just messaging and not storing. If required, a user can store the payload in a Blob and use a Queue message to point to that, letting the receiver fetch the message along with the Blob from its remote location. Each message added to a queue has a time-to-live property after which it is deleted automatically. The maximum and default time-to-live value is 7 days. In this recipe, we'll learn how to add messages to a queue. Getting ready This recipe assumes the following code is in the application configuration file: <appSettings> <add key="DataConnectionString" value="DefaultEndpointsProtocol=https;AccountName={ACCOUNT_NAME};AccountKey={ACCOUNT_KEY}"/> </appSettings> We must replace {ACCOUNT_NAME} and {ACCOUNT_KEY} with appropriate values of the account name and access key. How to do it... We are going to create a queue and add some messages to it. We do this as follows: Add a new class named AddMessagesOnStorageExample to the project. 
Install the WindowsAzure.Storage NuGet package and add the following assembly references to the project: System.Configuration Add the following using statements to the top of the class file: using Microsoft.WindowsAzure.Storage; using Microsoft.WindowsAzure.Storage.Queue; using System.Configuration; Add the following private member to the class: private CloudQueue cloudQueue; Add the following constructor to the class: public AddMessagesOnStorageExample(String queueName) { CloudStorageAccount cloudStorageAccount = CloudStorageAccount.Parse( ConfigurationManager.AppSettings[ "DataConnectionString"]); CloudQueueClient cloudQueueClient = cloudStorageAccount.CreateCloudQueueClient(); cloudQueue = cloudQueueClient.GetQueueReference(queueName); cloudQueue.CreateIfNotExists(); } Add the following method to the class, adding three messages: public void AddMessages() { String content1 = "Do something"; CloudQueueMessage message1 = new CloudQueueMessage(content1); cloudQueue.AddMessage(message1); String content2 = "Do something that expires in 1 day"; CloudQueueMessage message2 = new CloudQueueMessage(content2); cloudQueue.AddMessage(message2, TimeSpan.FromDays(1.0)); String content3 = "Do something that expires in 2 hours,"+ " starting in 1 hour from now"; CloudQueueMessage message3 = new CloudQueueMessage(content3); cloudQueue.AddMessage(message3, TimeSpan.FromHours(2), TimeSpan.FromHours(1)); } Add the following method, which uses the AddMessages() method, to the class: public static void UseAddMessagesExample() { String queueName = "{QUEUE_NAME}"; AddMessagesOnStorageExample example = new AddMessagesOnStorageExample(queueName); example.AddMessages(); }

How it works...

In steps 1 through 3, we set up the class. In step 4, we add a private member to store the CloudQueue object used to interact with the Queue service. We initialize this in the constructor we add in step 5, where we also create the queue. In step 6, we add a method that adds three messages to a queue. We create three CloudQueueMessage objects. We add the first message to the queue with the default time-to-live of seven days, the second is added with an expiration of 1 day, and the third becomes visible 1 hour after it enters the queue, with an absolute time-to-live of 2 hours. Note that a client (library) exception is thrown if we specify a visibility delay higher than the absolute TTL of the message. This is enforced at the client side, instead of making a (failing) server call. In step 7, we add a method that invokes the methods we added earlier. We need to replace {QUEUE_NAME} with an appropriate name for a queue.

There's more…

To clear the queue of the messages we added in this recipe, we can call the Clear() method of the CloudQueue class as follows: public void ClearQueue() { cloudQueue.Clear(); }

Summary

In this article, we have covered a selection of recipes that together give a complete overview of the software infrastructure that we need to set up on the cloud.

Resources for Article: Further resources on this subject: Backups in the VMware View Infrastructure [Article] vCloud Networks [Article] Setting Up a Test Infrastructure [Article]


Event-driven BPEL Process

Packt
19 Sep 2014
5 min read
In this article, by Matjaz B. Juric and Denis Weerasiri, authors of the book WS-BPEL 2.0 Beginner's Guide, we will study about the event-driven BPEL process. We will learn to develop a book shelving BPEL process. (For more resources related to this topic, see here.) Developing an event-driven BPEL process Firstly, we will develop an event-driven BPEL process. This is a BPEL process triggered by a business event. We will develop a process for book shelving. As we have already mentioned, such a process can be executed on various occasions, such as when a book arrives to the bookstore for the first time, after a customer has looked at the book, or even during an inventory. In contrast to a BPEL process, which exposes an operation that needs to be invoked explicitly, our book shelving process will react on a business event. We will call it a BookshelfEvent. We can see that in order to develop an event-driven BPEL process, we will need to firstly declare a business event, the BookshelfEvent. Following this, we will need to develop the event-driven book shelving BPEL process. Declaring a business event We will declare the BookshelfEvent business event, which will signal that a book is ready to be book shelved. Each business event contains a data payload, which is defined by the corresponding XML schema type. In our case, we will use the BookData type, the same one that we used in the book warehousing process. Time for action – declaring a business event To declare the BookshelfEvent business event, we will go to the composite view. We will proceed as follows: Right-click on the project in the Application window and select New and then Event Definition: A Create Event Definition dialog box will open. We will specify the EDL filename. This is the file where all the events are defined (similar to WSDL, where the web service operations are defined). We will use the BookEDL for the EDL filename. For the Namespace field, we will use http://packtpub.com/events/edl/BookEDL, as shown in the following screenshot: Next, we need to define the business events. We will use the green plus sign to declare the BookshelfEvent business event. After clicking on the green plus sign, the Create Event dialog box will open. We need to specify the event name, which is BookshelfEvent. We also have to specify the XML Type, which will be used for the event data payload. We will use the BookData from the Book Warehousing BPEL process schema, as shown in the following screenshot: After clicking on the OK button, we should see the following: What just happened? We have successfully declared the BookshelfEvent business event. This has generated the BookEDL.edl file with the source code as shown in the following screenshot: Developing a book shelving BPEL process After declaring the business event, we are ready to develop the event-driven book shelving BPEL process. The process will be triggered by our BookshelfEvent business event. This means that the process will not have a classic WSDL with the operation declaration. Rather it will be triggered by the BookshelfEvent business event. Time for action – developing an event-driven book shelving BPEL process To develop the event-driven book shelving BPEL process, we will go to the composite view. We will carry out the following steps: Drag-and-drop the BPEL Process service component from the right-hand side toolbar to the composite components area. The Create BPEL Process dialog box will open. 
We will select the BPEL 2.0 Specification, type BookShelvingBPEL for the Name of the process, and specify the namespace as http://packtpub.com/Bookstore/BookShelvingBPEL. Then, we will select Subscribe to Events from the drop-down list for the Template:
Next, we will need to specify the event to which our BPEL process will be subscribed. We will select the green plus sign and the Event Chooser dialog window will open. Here, we will simply select the BookshelfEvent business event:
After clicking on the OK button, we should see the following screenshot:
For event-driven BPEL processes, three consistency strategies for delivering events exist. The one and only one option delivers the events in the global transaction. Guaranteed delivers events asynchronously without a global transaction. Immediate delivers events in the same global transaction and the same thread as the publisher, and the publish call does not return until all immediate subscribers have completed processing.
After clicking on the OK button, we can find the new BookShelvingBPEL process on the composite diagram. Please note that the arrow icon denotes that the process is triggered by an event:
Double-clicking on the BookShelvingBPEL process opens the BPEL editor, where we can see that the BPEL process has a slightly different <receive> activity, which denotes that the process will be triggered by an event. Also, notice that an event-driven process does not return anything to the client, as event-driven processes are one-way and asynchronous:
What just happened?
We have successfully created the BookShelvingBPEL process. Looking at the source code, we can see that the overall structure is the same as with any other BPEL process. The difference is in the initial <receive> activity, which is triggered by the BookshelfEvent business event, as shown in the following screenshot:
Have a go hero – implementing the BookShelvingBPEL process
Implementing the event-driven BookShelvingBPEL process does not differ from implementing any other BPEL process. Therefore, it's your turn now. You should implement the BookShelvingBPEL process to do something meaningful. It could, for example, call a service which will query a database table. Or, it could include a human task.
Summary
In this article, we learned how to develop event-driven BPEL processes and how to invoke events from BPEL processes. We also learned to implement the BookShelvingBPEL process.
Resources for Article: Further resources on this subject: Securing a BPEL process [Article] Scopes in Advanced BPEL [Article] Advanced Activities in BPEL [Article]
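A note for readers following along in text form: the generated BookEDL.edl file appears in this walkthrough only as a screenshot. A rough sketch of what an Oracle SOA Suite event definition of this kind typically looks like is given below; the BookData namespace and the schema location are assumptions and will differ in your project, so treat it as an illustration rather than the book's exact output:

<definitions xmlns="http://schemas.oracle.com/events/edl"
             targetNamespace="http://packtpub.com/events/edl/BookEDL">
   <schema-import namespace="http://packtpub.com/bookstore/schema"
                  location="xsd/BookWarehousingBPELProcess.xsd"/>
   <event-definition name="BookshelfEvent">
      <content xmlns:ns0="http://packtpub.com/bookstore/schema" element="ns0:BookData"/>
   </event-definition>
</definitions>

Whatever the exact namespaces are in your composite, the important parts are the event-definition name (BookshelfEvent) and the content element, which ties the event payload to the BookData XML type used by the book warehousing process.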

Redis in Autosuggest

Packt
18 Sep 2014
8 min read
In this article by Arun Chinnachamy, the author of Redis Applied Design Patterns, we are going to see how to use Redis to build a basic autocomplete or autosuggest server. Also, we will see how to build a faceting engine using Redis. To build such a system, we will use sorted sets and operations involving ranges and intersections. To summarize, we will focus on the following topics in this article: (For more resources related to this topic, see here.) Autocompletion for words Multiword autosuggestion using a sorted set Faceted search using sets and operations such as union and intersection Autosuggest systems These days autosuggest is seen in virtually all e-commerce stores in addition to a host of others. Almost all websites are utilizing this functionality in one way or another from a basic website search to programming IDEs. The ease of use afforded by autosuggest has led every major website from Google and Amazon to Wikipedia to use this feature to make it easier for users to navigate to where they want to go. The primary metric for any autosuggest system is how fast we can respond with suggestions to a user's query. Usability research studies have found that the response time should be under a second to ensure that a user's attention and flow of thought are preserved. Redis is ideally suited for this task as it is one of the fastest data stores in the market right now. Let's see how to design such a structure and use Redis to build an autosuggest engine. We can tweak Redis to suit individual use case scenarios, ranging from the simple to the complex. For instance, if we want only to autocomplete a word, we can enable this functionality by using a sorted set. Let's see how to perform single word completion and then we will move on to more complex scenarios, such as phrase completion. Word completion in Redis In this section, we want to provide a simple word completion feature through Redis. We will use a sorted set for this exercise. The reason behind using a sorted set is that it always guarantees O(log(N)) operations. While it is commonly known that in a sorted set, elements are arranged based on the score, what is not widely acknowledged is that elements with the same scores are arranged lexicographically. This is going to form the basis for our word completion feature. Let's look at a scenario in which we have the words to autocomplete: jack, smith, scott, jacob, and jackeline. In order to complete a word, we need to use n-gram. Every word needs to be written as a contiguous sequence. n-gram is a contiguous sequence of n items from a given sequence of text or speech. To find out more, check http://en.wikipedia.org/wiki/N-gram. For example, n-gram of jack is as follows: j ja jac jack$ In order to signify the completed word, we can use a delimiter such as * or $. To add the word into a sorted set, we will be using ZADD in the following way: > zadd autocomplete 0 j > zadd autocomplete 0 ja > zadd autocomplete 0 jac > zadd autocomplete 0 jack$ Likewise, we need to add all the words we want to index for autocompletion. Once we are done, our sorted set will look as follows: > zrange autocomplete 0 -1 1) "j" 2) "ja" 3) "jac" 4) "jack$" 5) "jacke" 6) "jackel" 7) "jackeli" 8) "jackelin" 9) "jackeline$" 10) "jaco" 11) "jacob$" 12) "s" 13) "sc" 14) "sco" 15) "scot" 16) "scott$" 17) "sm" 18) "smi" 19) "smit" 20) "smith$" Now, we will use ZRANK and ZRANGE operations over the sorted set to achieve our desired functionality. 
To autocomplete for ja, we have to execute the following commands: > zrank autocomplete jac 2 zrange autocomplete 3 50 1) "jack$" 2) "jacke" 3) "jackel" 4) "jackeli" 5) "jackelin" 6) "jackeline$" 7) "jaco" 8) "jacob$" 9) "s" 10) "sc" 11) "sco" 12) "scot" 13) "scott$" 14) "sm" 15) "smi" 16) "smit" 17) "smith$" Another example on completing smi is as follows: zrank autocomplete smi 17 zrange autocomplete 18 50 1) "smit" 2) "smith$" Now, in our program, we have to do the following tasks: Iterate through the results set. Check if the word starts with the query and only use the words with $ as the last character. Though it looks like a lot of operations are performed, both ZRANGE and ZRANK are O(log(N)) operations. Therefore, there should be virtually no problem in handling a huge list of words. When it comes to memory usage, we will have n+1 elements for every word, where n is the number of characters in the word. For M words, we will have M(avg(n) + 1) records where avg(n) is the average characters in a word. The more the collision of characters in our universe, the less the memory usage. In order to conserve memory, we can use the EXPIRE command to expire unused long tail autocomplete terms. Multiword phrase completion In the previous section, we have seen how to use the autocomplete for a single word. However, in most real world scenarios, we will have to deal with multiword phrases. This is much more difficult to achieve as there are a few inherent challenges involved: Suggesting a phrase for all matching words. For instance, the same manufacturer has a lot of models available. We have to ensure that we list all models if a user decides to search for a manufacturer by name. Order the results based on overall popularity and relevance of the match instead of ordering lexicographically. The following screenshot shows the typical autosuggest box, which you find in popular e-commerce portals. This feature improves the user experience and also reduces the spell errors: For this case, we will use a sorted set along with hashes. We will use a sorted set to store the n-gram of the indexed data followed by getting the complete title from hashes. Instead of storing the n-grams into the same sorted set, we will store them in different sorted sets. Let's look at the following scenario in which we have model names of mobile phones along with their popularity: For this set, we will create multiple sorted sets. Let's take Apple iPhone 5S: ZADD a 9 apple_iphone_5s ZADD ap 9 apple_iphone_5s ZADD app 9 apple_iphone_5s ZADD apple 9 apple_iphone_5s ZADD i 9 apple_iphone_5s ZADD ip 9 apple_iphone_5s ZADD iph 9 apple_iphone_5s ZADD ipho 9 apple_iphone_5s ZADD iphon 9 apple_iphone_5s ZADD iphone 9 apple_iphone_5s ZADD 5 9 apple_iphone_5s ZADD 5s 9 apple_iphone_5s HSET titles apple_iphone_5s "Apple iPhone 5S" In the preceding scenario, we have added every n-gram value as a sorted set and created a hash that holds the original title. Likewise, we have to add all the titles into our index. Searching in the index Now that we have indexed the titles, we are ready to perform a search. Consider a situation where a user is querying with the term apple. We want to show the user the five best suggestions based on the popularity of the product. Here's how we can achieve this: > zrevrange apple 0 4 withscores 1) "apple_iphone_5s" 2) 9.0 3) "apple_iphone_5c" 4) 6.0 As the elements inside the sorted set are ordered by the element score, we get the matches ordered by the popularity which we inserted. 
To get the original title, type the following command:
> hmget titles apple_iphone_5s
1) "Apple iPhone 5S"
In the preceding scenario, the query was a single word. Now imagine that the user types multiple words, such as Samsung nex, and we have to suggest the autocomplete as Samsung Galaxy Nexus. To achieve this, we will use ZINTERSTORE as follows:
> zinterstore samsung_nex 2 samsung nex aggregate max
The general syntax is ZINTERSTORE destination numkeys key [key ...] [WEIGHTS weight [weight ...]] [AGGREGATE SUM|MIN|MAX]. This computes the intersection of the sorted sets given by the specified keys and stores the result in destination. It is mandatory to provide the number of input keys before passing the input keys and other (optional) arguments. For more information about ZINTERSTORE, visit http://redis.io/commands/ZINTERSTORE.
The previous command, zinterstore samsung_nex 2 samsung nex aggregate max, computes the intersection of the two sorted sets, samsung and nex, and stores the result in another sorted set, samsung_nex. To see the result, type the following commands:
> zrevrange samsung_nex 0 4 withscores
1) samsung_galaxy_nexus
2) 7
> hmget titles samsung_galaxy_nexus
1) Samsung Galaxy Nexus
If you want to cache the result of multiword queries and remove it automatically, use the EXPIRE command to set an expiry on the temporary keys.
Summary
In this article, we have seen how to perform autosuggest and faceted searches using Redis, and how sorted sets and sets work. We have also seen how Redis can be used as the backend for a simple faceting and autosuggest system and make that system ultrafast.
Further resources on this subject: Using Redis in a hostile environment (Advanced) [Article] Building Applications with Spring Data Redis [Article] Implementing persistence in Redis (Intermediate) [Article]
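As a supplement to the word-completion walkthrough above, here is a minimal sketch of the client-side routine the article describes: find the prefix with ZRANK, read a window with ZRANGE, and keep only entries that start with the prefix and end with the $ marker. It uses the redis-rb gem; the autocomplete key and the 50-entry window match the examples above, while the method name and connection setup are our own assumptions:

require 'redis'   # redis-rb gem, assumed to be installed

# Returns up to `count` completed words for the given prefix.
def complete(redis, prefix, count = 10)
  results = []
  start = redis.zrank('autocomplete', prefix)
  return results if start.nil?               # the prefix was never indexed

  candidates = redis.zrange('autocomplete', start + 1, start + 50)
  candidates.each do |entry|
    break unless entry.start_with?(prefix)   # we have left the prefix range
    results << entry.chomp('$') if entry.end_with?('$')
    break if results.size >= count
  end
  results
end

redis = Redis.new
puts complete(redis, 'ja').inspect
# => ["jack", "jackeline", "jacob"] with the sample words indexed earlier

Because ZRANK and ZRANGE are both cheap server-side operations, the cost of a lookup is essentially one round trip plus a small client-side filter.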

Waiting for AJAX, as always…

Packt
18 Sep 2014
6 min read
In this article, by Dima Kovalenko, author of the book, Selenium Design Patterns and Best Practices, we will learn how test automation have progressed over the period of time. Test automation was simpler in the good old days, before asynchronous page loading became mainstream. Previously the test would click on a button causing the whole page to reload; after the new page load we could check if any errors were displayed. The act of waiting for the page to load guaranteed that all of the items on the page are already there, and if expected element was missing our test could fail with confidence. Now, an element might be missing for several seconds, and magically show up after an unspecified delay. The only thing for a test to do is become smarter! (For more resources related to this topic, see here.) Filling out credit card information is a common test for any online store. Let’s take a look at a typical credit card form: Our form has some default values for user to fill out, and a quick JavaScript check that the required information was entered into the field, by adding Done next to a filled out input field, like this: Once all of the fields have been filled out and seem correct, JavaScript makes the Purchase button clickable. Clicking on the button will trigger an AJAX request for the purchase, followed by successful purchase message, like this: Very simple and straight forward, anyone who has made an online purchase has seen some variation of this form. Writing a quick test to fill out the form and make sure the purchase is complete should be a breeze! Testing AJAX with sleep method Let’s take a look at a simple test, written to test this form. Our tests are written in Ruby for this demonstration for easy of readability. However, this technique will work in Java or any other programming language you may choose to use. To follow along with this article, please make sure you have Ruby and selenium-webdriver gem installed. Installers for both can be found here https://www.ruby-lang.org/en/installation/ and http://rubygems.org/gems/selenium-webdriver. Our test file starts like this: If this code looks like a foreign language to you, don’t worry we will walk through it until it all makes sense. First three lines of the test file specify all of the dependencies such as selenium-webdriver gem. On line five, we declare our test class as TestAjax which inherits its behavior from the Test::Unit framework we required on line two. The setup and teardown methods will take care of the Selenium instance for us. In the setup we create a new instance of Firefox browser and navigate to a page, which contains the mentioned form; the teardown method closes the browser after the test is complete. Now let’s look at the test itself: Lines 17 to 21 fill out the purchase form with some test data, followed by an assertion that Purchase complete! text appears in the DIV with ID of success. Let’s run this test to see if it passes. The following is the output result of our test run; as you can see it’s a failure: Our test fails because it was expecting to see Purchase complete! right here: But no text was found, because the AJAX request took a much longer time than expected. 
The AJAX request in progress indicator is seen here:
Since this AJAX request can take anywhere from 15 to 30 seconds to complete, the most logical next step is to add a pause between the click on the Purchase button and the test assertion, as shown here:
However, this obvious solution is really bad for two reasons:
If the majority of AJAX requests take 15 seconds to run, then our test wastes another 15 seconds waiting instead of continuing.
If our test environment is under heavy load, the AJAX request can take as long as 45 seconds to complete, so our test will fail anyway.
The better choice is to make our tests smart enough to wait for the AJAX request to complete, instead of using a sleep method.
Using smart AJAX waits
To solve the shortcomings of the sleep method, we will create a new method called wait_for_ajax, seen here:
In this method, we use the Wait class built into WebDriver. The until method in the Wait class allows us to pause the test execution for an arbitrary reason; in this case, to sleep for 1 second (line 29) and to execute a JavaScript command in the browser with the help of the execute_script method. This method allows us to run a JavaScript snippet in the current browser window on the current page, which gives us access to all of the variables and methods that JavaScript has. The snippet of JavaScript that we are sending to the browser is a query against the jQuery framework. The jQuery.active counter holds the number of currently active AJAX requests; zero means that the page is fully loaded and there are no background HTTP requests happening. On line 30, we ask execute_script to return the current count of active AJAX requests on the page, and if the returned value equals 0, we break out of the Wait loop. Once the loop is broken, our tests can continue on their way. Note that the upper limit of the wait_for_ajax method is set to 60 seconds on line 28. This value can be increased or decreased, depending on how slow the test environment is.
Let's replace the sleep method call with our newly created method, shown here:
And run our tests one more time to see this passing result:
Now that we have stabilized our test against slow and unpredictable AJAX requests, we need to add a method that will wait for JavaScript animations to finish. These animations can break our tests just as much as the AJAX requests. Also, our tests are incredibly vulnerable to third-party slowness, such as when the Facebook Like button takes a long time to load.
Summary
This article introduced a simple method that intelligently waits for all of the AJAX requests to complete, which increases the overall stability of our tests and test suite. Furthermore, we have removed a wasteful, unnecessary delay from our test execution. In conclusion, we have improved test stability while at the same time making our tests run faster!
Resources for Article: Further resources on this subject: Quick Start into Selenium Tests [article] Behavior-driven Development with Selenium WebDriver [article] Exploring Advanced Interactions of WebDriver [article]
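The wait_for_ajax method itself is shown in the book only as a screenshot. The following is a hedged reconstruction that is consistent with the description above (a Wait with a 60-second limit, a 1-second sleep, and an execute_script call that checks jQuery.active), together with a similar helper for the jQuery animations mentioned in the closing paragraph. The @selenium variable name, the animation helper, and the exact line layout are assumptions rather than the author's verbatim code, and both helpers assume the application under test loads jQuery:

def wait_for_ajax(timeout = 60)
  wait = Selenium::WebDriver::Wait.new(:timeout => timeout)
  wait.until do
    sleep 1                                    # give in-flight requests a moment
    # jQuery.active is 0 once no AJAX requests are running
    @selenium.execute_script('return jQuery.active').zero?
  end
end

def wait_for_animation(timeout = 60)
  wait = Selenium::WebDriver::Wait.new(:timeout => timeout)
  wait.until do
    # jQuery's :animated selector matches elements with a running animation
    @selenium.execute_script("return jQuery(':animated').length").zero?
  end
end

Both helpers raise a timeout error if the page never settles, which is usually the behavior you want: a hung AJAX request is itself a defect worth failing on.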

Index, Item Sharding, and Projection in DynamoDB

Packt
17 Sep 2014
13 min read
Understanding the secondary index and projections should go hand in hand because of the fact that a secondary index cannot be used efficiently without specifying projection. In this article by Uchit Vyas and Prabhakaran Kuppusamy, authors of DynamoDB Applied Design Patterns, we will take a look at local and global secondary indexes, and projection and its usage with indexes. (For more resources related to this topic, see here.) The use of projection in DynamoDB is pretty much similar to that of traditional databases. However, here are a few things to watch out for: Whenever a DynamoDB table is created, it is mandatory to create a primary key, which can be of a simple type (hash type), or it can be of a complex type (hash and range key). For the specified primary key, an index will be created (we call this index the primary index). Along with this primary key index, the user is allowed to create up to five secondary indexes per table. There are two kinds of secondary index. The first is a local secondary index (in which the hash key of the index must be the same as that of the table) and the second is the global secondary index (in which the hash key can be any field). In both of these secondary index types, the range key can be a field that the user needs to create an index for. Secondary indexes A quick question: while writing a query in any database, keeping the primary key field as part of the query (especially in the where condition) will return results much faster compared to the other way. Why? This is because of the fact that an index will be created automatically in most of the databases for the primary key field. This the case with DynamoDB also. This index is called the primary index of the table. There is no customization possible using the primary index, so the primary index is seldom discussed. In order to make retrieval faster, the frequently-retrieved attributes need to be made as part of the index. However, a DynamoDB table can have only one primary index and the index can have a maximum of two attributes (hash and range key). So for faster retrieval, the user should be given privileges to create user-defined indexes. This index, which is created by the user, is called the secondary index. Similar to the table key schema, the secondary index also has a key schema. Based on the key schema attributes, the secondary index can be either a local or global secondary index. Whenever a secondary index is created, during every item insertion, the items in the index will be rearranged. This rearrangement will happen for each item insertion into the table, provided the item contains both the index's hash and range key attribute. Projection Once we have an understanding of the secondary index, we are all set to learn about projection. While creating the secondary index, it is mandatory to specify the hash and range attributes based on which the index is created. Apart from these two attributes, if the query wants one or more attribute (assuming that none of these attributes are projected into the index), then DynamoDB will scan the entire table. This will consume a lot of throughput capacity and will have comparatively higher latency. 
The following is the table (with some data) that is used to store book information: Here are few more details about the table: The BookTitle attribute is the hash key of the table and local secondary index The Edition attribute is the range key of the table The PubDate attribute is the range key of the index (let's call this index IDX_PubDate) Local secondary index While creating the secondary index, the hash and range key of the table and index will be inserted into the index; optionally, the user can specify what other attributes need to be added. There are three kinds of projection possible in DynamoDB: KEYS_ONLY: Using this, the index consists of the hash and range key values of the table and index INCLUDE: Using this, the index consists of attributes in KEYS_ONLY plus other non-key attributes that we specify ALL: Using this, the index consists of all of the attributes from the source table The following code shows the creation of a local secondary index named Idx_PubDate with BookTitle as the hash key (which is a must in the case of a local secondary index), PubDate as the range key, and using the KEYS_ONLY projection: private static LocalSecondaryIndex getLocalSecondaryIndex() { ArrayList<KeySchemaElement> indexKeySchema =    newArrayList<KeySchemaElement>(); indexKeySchema.add(new KeySchemaElement()    .withAttributeName("BookTitle")    .withKeyType(KeyType.HASH)); indexKeySchema.add(new KeySchemaElement()    .withAttributeName("PubDate")    .withKeyType(KeyType.RANGE)); LocalSecondaryIndex lsi = new LocalSecondaryIndex()    .withIndexName("Idx_PubDate")    .withKeySchema(indexKeySchema)    .withProjection(new Projection()    .withProjectionType("KEYS_ONLY")); return lsi; } The usage of the KEYS_ONLY index type will create the smallest possible index and the usage of ALL will create the biggest possible index. We will discuss the trade-offs between these index types a little later. Going back to our example, let us assume that we are using the KEYS_ONLY index type, so none of the attributes (other than the previous three key attributes) are projected into the index. So the index will look as follows: You may notice that the row order of the index is almost the same as that of the table order (except the second and third rows). Here, you can observe one point: the table records will be grouped primarily based on the hash key, and then the records that have the same hash key will be ordered based on the range key of the index. In the case of the index, even though the table's range key is part of the index attribute, it will not play any role in the ordering (only the index's hash and range keys will take part in the ordering). There is a negative in this approach. If the user is writing a query using this index to fetch BookTitle and Publisher with PubDate as 28-Dec-2008, then what happens? Will DynamoDB complain that the Publisher attribute is not projected into the index? The answer is no. The reason is that even though Publisher is not projected into the index, we can still retrieve it using the secondary index. However, retrieving a nonprojected attribute will scan the entire table. So if we are sure that certain attributes need to be fetched frequently, then we must project it into the index; otherwise, it will consume a large number of capacity units and retrieval will be much slower as well. One more question: if the user is writing a query using the local secondary index to fetch BookTitle and Publisher with PubDate as 28-Dec-2008, then what happens? 
Will DynamoDB complain that the PubDate attribute is not part of the primary key and hence queries are not allowed on nonprimary key attributes? The answer is no. It is a rule of thumb that we can write queries on the secondary index attributes. It is possible to include nonprimary key attributes as part of the query, but these attributes must at least be key attributes of the index. The following code shows how to add non-key attributes to the secondary index's projection: private static Projection getProjectionWithNonKeyAttr() { Projection projection = new Projection()    .withProjectionType(ProjectionType.INCLUDE); ArrayList<String> nonKeyAttributes = new ArrayList<String>(); nonKeyAttributes.add("Language"); nonKeyAttributes.add("Author2"); projection.setNonKeyAttributes(nonKeyAttributes); return projection; } There is a slight limitation with the local secondary index. If we write a query on a non-key (both table and index) attribute, then internally DynamoDB might need to scan the entire table; this is inefficient. For example, consider a situation in which we need to retrieve the number of editions of the books in each and every language. Since both of the attributes are non-key, even if we create a local secondary index with either of the attributes (Edition and Language), the query will still result in a scan operation on the entire table. Global secondary index A problem arises here: is there any way in which we can create a secondary index using both the index keys that are different from the table's primary keys? The answer is the global secondary index. The following code shows how to create the global secondary index for this scenario: private static GlobalSecondaryIndex getGlobalSecondaryIndex() { GlobalSecondaryIndex gsi = new GlobalSecondaryIndex()    .withIndexName("Idx_Pub_Edtn")    .withProvisionedThroughput(new ProvisionedThroughput()    .withReadCapacityUnits((long) 1)    .withWriteCapacityUnits((long) 1))    .withProjection(newProjection().withProjectionType      ("KEYS_ONLY"));   ArrayList<KeySchemaElement> indexKeySchema1 =    newArrayList<KeySchemaElement>();   indexKeySchema1.add(new KeySchemaElement()    .withAttributeName("Language")    .withKeyType(KeyType.HASH)); indexKeySchema1.add(new KeySchemaElement()    .withAttributeName("Edition")    .withKeyType(KeyType.RANGE));   gsi.setKeySchema(indexKeySchema1); return gsi; } While deciding the attributes to be projected into a global secondary index, there are trade-offs we must consider between provisioned throughput and storage costs. A few of these are listed as follows: If our application doesn't need to query a table so often and it performs frequent writes or updates against the data in the table, then we must consider projecting the KEYS_ONLY attributes. The global secondary index will be minimum size, but it will still be available when required for the query activity. The smaller the index, the cheaper the cost to store it and our write costs will be cheaper too. If we need to access only those few attributes that have the lowest possible latency, then we must project only those (lesser) attributes into a global secondary index. If we need to access almost all of the non-key attributes of the DynamoDB table on a frequent basis, we can project these attributes (even the entire table) into the global secondary index. This will give us maximum flexibility with the trade-off that our storage cost would increase, or even double if we project the entire table's attributes into the index. 
The additional storage costs to store the global secondary index might equalize the cost of performing frequent table scans. If our application will frequently retrieve some non-key attributes, we must consider projecting these non-key attributes into the global secondary index. Item sharding Sharding, also called horizontal partitioning, is a technique in which rows are distributed among the database servers to perform queries faster. In the case of sharding, a hash operation will be performed on the table rows (mostly on one of the columns) and, based on the hash operation output, the rows will be grouped and sent to the proper database server. Take a look at the following diagram: As shown in the previous diagram, if all the table data (only four rows and one column are shown for illustration purpose) is stored in a single database server, the read and write operations will become slower and the server that has the frequently accessed table data will work more compared to the server storing the table data that is not accessed frequently. The following diagram shows the advantage of sharding over a multitable, multiserver database environment: In the previous diagram, two tables (Tbl_Places and Tbl_Sports) are shown on the left-hand side with four sample rows of data (Austria.. means only the first column of the first item is illustrated and all other fields are represented by ..).We are going to perform a hash operation on the first column only. In DynamoDB, this hashing will be performed automatically. Once the hashing is done, similar hash rows will be saved automatically in different servers (if necessary) to satisfy the specified provisioned throughput capacity. Have you ever wondered about the importance of the hash type key while creating a table (which is mandatory)? Of course we all know the importance of the range key and what it does. It simply sorts items based on the range key value. So far, we might have been thinking that the range key is more important than the hash key. If you think that way, then you may be correct, provided we neither need our table to be provisioned faster nor do we need to create any partitions for our table. As long as the table data is smaller, the importance of the hash key will be realized only while writing a query operation. However, once the table grows, in order to satisfy the same provision throughput, DynamoDB needs to partition our table data based on this hash key (as shown in the previous diagram). This partitioning of table items based on the hash key attribute is called sharding. It means the partitions are created by splitting items and not attributes. This is the reason why a query that has the hash key (of table and index) retrieves items much faster. Since the number of partitions is managed automatically by DynamoDB, we cannot just hope for things to work fine. We also need to keep certain things in mind, for example, the hash key attribute should have more distinct values. To simplify, it is not advisable to put binary values (such as Yes or No, Present or Past or Future, and so on) into the hash key attributes, thereby restricting the number of partitions. If our hash key attribute has either Yes or No values in all the items, then DynamoDB can create only a maximum of two partitions; therefore, the specified provisioned throughput cannot be achieved. Just consider that we have created a table called Tbl_Sports with a provisioned throughput capacity of 10, and then we put 10 items into the table. 
Assuming that only a single partition is created, we are able to retrieve 10 items per second. After a point of time, we put 10 more items into the table. DynamoDB will create another partition (by hashing over the hash key), thereby satisfying the provisioned throughput capacity. There is a formula taken from the AWS site: Total provisioned throughput/partitions = throughput per partition OR No. of partitions = Total provisioned throughput/throughput per partition In order to satisfy throughput capacity, the other parameters will be automatically managed by DynamoDB. Summary In this article, we saw what the local and global secondary indexes are. We walked through projection and its usage with indexes. Resources for Article: Further resources on this subject: Comparative Study of NoSQL Products [Article] Ruby with MongoDB for Web Development [Article] Amazon DynamoDB - Modelling relationships, Error handling [Article]
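To round off the global secondary index discussion, here is a sketch of how the Idx_Pub_Edtn index created above might be queried with the low-level API of the AWS SDK for Java. The table name (Tbl_Book), the client construction, and the assumption that Edition is a numeric attribute are ours; the index name and key attributes match the article's code. Note that with the KEYS_ONLY projection used above, each returned item carries only the index keys plus the table's primary key attributes:

import java.util.HashMap;
import java.util.Map;
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.model.*;

public class QueryGsiExample {
    public static void main(String[] args) {
        AmazonDynamoDBClient client = new AmazonDynamoDBClient();

        // Find all third editions of English-language books using only the
        // index; no table scan is involved because both index keys are given.
        Map<String, Condition> keyConditions = new HashMap<String, Condition>();
        keyConditions.put("Language", new Condition()
            .withComparisonOperator(ComparisonOperator.EQ)
            .withAttributeValueList(new AttributeValue().withS("English")));
        keyConditions.put("Edition", new Condition()
            .withComparisonOperator(ComparisonOperator.EQ)
            .withAttributeValueList(new AttributeValue().withN("3")));

        QueryRequest request = new QueryRequest()
            .withTableName("Tbl_Book")
            .withIndexName("Idx_Pub_Edtn")
            .withKeyConditions(keyConditions);

        QueryResult result = client.query(request);
        for (Map<String, AttributeValue> item : result.getItems()) {
            System.out.println(item);
        }
    }
}

Because the hash key of this global secondary index is Language, a query like this never degenerates into the full-table scan that the same question would require against the local secondary index, whose hash key is fixed to BookTitle.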

Security Settings in Salesforce

Packt
11 Sep 2014
10 min read
In the article by Rakesh Gupta and Sagar Pareek, authors of Salesforce.com Customization Handbook, we will discuss Organization-Wide Default (OWD) and various ways to share records. We will also discuss the various security settings in Salesforce. The following topics will be covered in this article: (For more resources related to this topic, see here.) Concepts of OWD The sharing rule Field-Level Security and its effect on data visibility Setting up password polices Concepts of OWD Organization-Wide Default is also known as OWD. This is the base-level sharing and setting of objects in your organization. By using this, you can secure your data so that other users can't access data that they don't have access to. The following diagram shows the basic database security in Salesforce. In this, OWD plays a key role. It's a base-level object setting in the organization, and you can't go below this. So here, we will discuss OWD in Salesforce. Let's start with an example. Sagar Pareek is the system administrator in Appiuss. His manager Sara Barellies told him that the user who has created or owns the account records as well as the users that are higher in the role hierarchy can access the records. Here, you have to think first about OWD because it is the basic thing to restrict object-level access in Salesforce. To achieve this, Sagar Pareek has to set Organization-Wide Default for the account object to private. Setting up OWD To change or update OWD for your organization, follow these steps: Navigate to Setup | Administer | Security Controls | Sharing Settings. From the Manage sharing settings for drop-down menu, select the object for which you want to change OWD. Click on Edit. From the Default Access drop-down menu, select an access as per your business needs. For the preceding scenario, select Private to grant access to users who are at a high position in the role hierarchy, by selecting Grant access using hierarchy. For standard objects, it is automatically selected, and for custom objects, you have the option to select it. Click on Save. The following table describes the various types of OWD access and their respective description: OWD access Description Private Only the owner of the records and the higher users in the role hierarchy are able to access and report on the records. Public read only All users can view the records, but only the owners and the users higher in the role hierarchy can edit them. Public read/write All users can view, edit, and report on all records. Public read/write/ transfer All users can view, edit, transfer, and report on all records. This is only available for case and lead objects. Controlled by parent This says that access on the child object's records is controlled by the parent. Public full access This is available for campaigns. In this, all users can view, edit, transfer, and report on all records.   You can assign this access to campaigns, accounts, cases, contacts, contracts, leads, opportunities, users, and custom objects. This feature is only available for Professional, Enterprise, Unlimited, Performance, Developer, and Database Editions. Basic OWD settings for objects Whenever you buy your Salesforce Instance, it comes with the predefined OWD settings for standard objects. You can change them anytime by following the path Setup | Administer | Security Controls | Sharing Settings. 
The following table describes the default access to objects: Object Default access Account Public read/write Activity Private Asset Public read/write Campaign Public full access Case Public read/write transfer Contact Controlled by parent (that is, account) Contract Public read/write Custom Object Public read/write Lead Public read/write transfer Opportunity Public read only Users Public read only and private for external users Let's continue with another example. Sagar Pareek is the system administrator in Appiuss. His manager Sara Barellies told him that only the users who created the record for the demo object can access the records, and no one else can have the power to view/edit/delete it. To do this, you have to change OWD for a demo object to private, and don't select Grant Access Using Hierarchy. When you select the Grant Access Using Hierarchy field, it provides access to people who are above in the role hierarchy. Sharing Rule To open the record-level access for a group of users, roles, or roles and subordinates beyond OWD, you can use Sharing Rule. Sharing Rule is used for open access; you can't use Sharing Rule to restrict access. Let's start with an example where Sagar Pareek is the system administrator in Appiuss. His manager Sara Barellies wants every user in the organization to be able to view the account records but only a group of users (all the users do not belong to the same role or have the same profile) can edit it. To solve the preceding business requirement, you have to follow these steps: First, change the OWD account to Public Read Only by following the path Setup | Administer | Security Controls | Sharing Settings, so all users from the organization can view the account records. Now, create a public group Account access and add users as per the business requirement. To create a public group, follow the path Name | Setup | Administration Setup | Manage Users | Public Groups. Finally, you have to create a sharing rule. To create sharing rules, follow the path Setup | Administer | Security Controls | Sharing Settings, and navigate to the list related to Account Sharing Rules: Click on New, and it will redirect you to a new window where you have to enter Label, Rule Name, and Description (always write a description so that other administrators or developers get to know why this rule was created). Then, for Rule Type, select Based on criteria. Select the criteria by which records are to be shared and create a criterion so that all records fall under it (such as Account Name not equal to null). Select Public Groups in the Share with option and your group name. Select the level of access for the users. Here, select Read/Write from the drop-down menu of Default Account, Contract and Asset Access. Finally, it will look like the following screenshot: Types of Sharing Rules What we did to solve the preceding business requirement is called Sharing Rule. There is a limitation on Sharing Rules; you can write only 50 Sharing Rules (criteria-based) and 300 Sharing Rules (both owner- and criteria-based) per object. The following are the types of Sharing Rules in Salesforce: Manual Sharing: Only when OWD is set to Private or Public Read for any object will a sharing button be enabled in the record detail page. Record owners or users, who are at a higher position in role and hierarchy, can share records with other users. For the last business use case, we changed the account OWD to Public Read Only. 
If you navigate to the Account records detail page, you can see the Sharing button: Click on the Sharing button and it will redirect you to a new window. Now, click on Add and you are ready to share records with the following: Public groups Users Roles Roles and subordinates Select the access type for each object and click on Save. It will look like what is shown in the following screenshot: The Lead and Case Sharing buttons will be enabled when OWD is Private, Public Read Only, and Public Read/Write. Apex Sharing: When all other Sharing Rules can't fulfill your requirements, then you can use the Apex Sharing method to share records. It gives you the flexibility to handle complex sharing. Apex-managed sharing is a type of programmatic sharing that allows you to define a custom sharing reason to associate with your programmatic share. Standard Salesforce objects support programmatic sharing while custom objects support Apex-managed sharing. Field-Level Security and its effect on data visibility Data on fields is very important for any organization. They want to show some data to the field-specific users. In Salesforce, you can use Field-Level Security to make fields hidden or read-only for a specific profile. There are three ways in Salesforce to set Field-Level Security: From an object-field From a profile Field accessibility From an object-field Let's start with an example where Sagar Pareek is the system administrator in Appiuss. His manager Sara Barellies wants to create a field (phone) on an account object and make this field read-only for all users and also allowing system administrators to edit the field. To solve this business requirement, follow these steps: Navigate to Setup | Customize | Account | Fields and then click on the Phone (it's a hyperlink) field. It will redirect you to the detail page of the Phone field; you will see a page like the following screenshot: Click on the Set Field-Level Security button, and it will redirect you to a new page where you can set the Field-Level Security. Select Visible and Read-Only for all the profiles other than that of the system administrator. For the system administrator, select only Visible. Click on Save. If you select Read-Only, the visible checkbox will automatically get selected. From a profile Similarly, in Field-Level settings, you can also achieve the same results from a profile. Let's follow the preceding business use case to be achieved through the profile. To do this, follow these steps: Navigate to Setup | Administer | Manage Users | Profile, go to the System Administrator profile, and click on it. Now, you are on the profile detail page. Navigate to the Field-Level Security section. It will look like the following screenshot: Click on the View link beside the Account option. It will open Account Field-Level Security for the profile page. Click on the Edit button and edit Field-Level Security as we did in the previous section. Field accessibility We can achieve the same outcome by using field accessibility. To do this, follow these steps: Navigate to Setup | Administer | Security Controls | Field Accessibility. Click on the object name; in our case, it's Account. It will redirect you to a new page where you can select View by Fields or View by Profiles: In our case, select View by Fields and then select the field Phone. Click on the editable link as shown in the following screenshot: It will open the Access Settings for Account Field page, where you can edit the Field-Level Security. Once done, click on Save. 
Setting up password policies For security purposes, Salesforce provides an option to set password policies for the organization. Let's start with an example. Sagar Pareek, the system administrator of an organization, has decided to create a policy regarding the password for the organization, where the password of each user must be of 10 characters and must be a combination of alphanumeric and special characters. To do this, he will have to follow these steps: Navigate to Setup | Security Controls | Password Policies. It will open the Password Policies setup page: In the Minimum password length field, select 10 characters. In the Password complexity requirement field, select Must mix Alpha, numeric and special characters. Here, you can also decide when the password should expire under the User password expire in option. Enforce the password history under the option enforce password history, and set a password question requirement as well as the number of invalid attempts allowed and the lock-out period. Click on Save. Summary In this article, we have gone through various security setting features available on Salesforce. Starting from OWD, followed by Sharing Rules and Field-Level Security, we also covered password policy concepts. Resources for Article: Further resources on this subject: Introducing Salesforce Chatter [Article] Salesforce CRM Functions [Article] Adding a Geolocation Trigger to the Salesforce Account Object [Article]
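Going back to the Apex Sharing option mentioned in the Sharing Rule section, here is a minimal, hedged sketch of what programmatic sharing of an account record can look like in Apex. The class and method names are ours, and the IDs would normally come from a trigger context or a SOQL query rather than being passed in directly:

public class AccountSharingExample {
    // Grants a user or public group edit access to a single account record.
    public static void shareAccount(Id accountId, Id userOrGroupId) {
        AccountShare share = new AccountShare();
        share.AccountId = accountId;             // the record being shared
        share.UserOrGroupId = userOrGroupId;     // a user ID or public group ID
        share.AccountAccessLevel = 'Edit';       // Read or Edit
        share.OpportunityAccessLevel = 'Read';   // required on AccountShare rows
        insert share;                            // RowCause defaults to Manual
    }
}

For custom objects, the same pattern uses the automatically created ObjectName__Share object with its ParentId, UserOrGroupId, AccessLevel, and RowCause fields. As with the Sharing button, programmatic sharing is only meaningful when the object's OWD is more restrictive than the access being granted.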

Introducing LLVM Intermediate Representation

Packt
26 Aug 2014
18 min read
In this article by Bruno Cardoso Lopez and Rafael Auler, the authors of Getting Started with LLVM Core Libraries, we will look into some basic concepts of the LLVM intermediate representation (IR). (For more resources related to this topic, see here.) LLVM IR is the backbone that connects frontends and backends, allowing LLVM to parse multiple source languages and generate code to multiple targets. Frontends produce the IR, while backends consume it. The IR is also the point where the majority of LLVM target-independent optimizations takes place. Overview The choice of the compiler IR is a very important decision. It determines how much information the optimizations will have to make the code run faster. On one hand, a very high-level IR allows optimizers to extract the original source code intent with ease. On the other hand, a low-level IR allows the compiler to generate code tuned for a particular hardware more easily. The more information you have about the target machine, the more opportunities you have to explore machine idiosyncrasies. Moreover, the task at lower levels must be done with care. As the compiler translates the program to a representation that is closer to machine instructions, it becomes increasingly difficult to map program fragments to the original source code. Furthermore, if the compiler design is exaggerated using a representation that represents a specific target machine very closely, it becomes awkward to generate code for other machines that have different constructs. This design trade-off has led to different choices among compilers. Some compilers, for instance, do not support code generation for multiple targets and focus on only one machine architecture. This enables them to use specialized IRs throughout their entire pipeline that make the compiler efficient with respect to a single architecture, which is the case of the Intel C++ Compiler (icc). However, writing compilers that generate code for a single architecture is an expensive solution if you aim to support multiple targets. In these cases, it is unfeasible to write a different compiler for each architecture, and it is best to design a single compiler that performs well on a variety of targets, which is the goal of compilers such as GCC and LLVM. For these projects, called retargetable compilers, there are substantially more challenges to coordinate the code generation for multiple targets. The key to minimizing the effort to build a retargetable compiler lies in using a common IR, the point where different backends share the same understanding about the source program to translate it to a divergent set of machines. Using a common IR, it is possible to share a set of target-independent optimizations among multiple backends, but this puts pressure on the designer to raise the level of the common IR to not overrepresent a single machine. Since working at higher levels precludes the compiler from exploring target-specific trickery, a good retargetable compiler also employs other IRs to perform optimizations at different, lower levels. The LLVM project started with an IR that operated at a lower level than the Java bytecode, thus, the initial acronym was Low Level Virtual Machine. The idea was to explore low-level optimization opportunities and employ link-time optimizations. The link-time optimizations were made possible by writing the IR to disk, as in a bytecode. The bytecode allows the user to amalgamate multiple modules in the same file and then apply interprocedural optimizations. 
In this way, the optimizations will act on multiple compilation units as if they were in the same module. LLVM, nowadays, is neither a Java competitor nor a virtual machine, and it has other intermediate representations to achieve efficiency. For example, besides the LLVM IR, which is the common IR where target-independent optimizations work, each backend may apply target-dependent optimizations when the program is represented with the MachineFunction and MachineInstr classes. These classes represent the program using target-machine instructions. On the other hand, the Function and Instruction classes are, by far, the most important ones because they represent the common IR that is shared across multiple targets. This intermediate representation is mostly target-independent (but not entirely) and the official LLVM intermediate representation. To avoid confusion, while LLVM has other levels to represent a program, which technically makes them IRs as well, we do not refer to them as LLVM IRs; however, we reserve this name for the official, common intermediate representation by the Instruction class, among others. This terminology is also adopted by the LLVM documentation. The LLVM project started as a set of tools that orbit around the LLVM IR, which justifies the maturity of the optimizers and the number of optimizers that act at this level. This IR has three equivalent forms: An in-memory representation (the Instruction class, among others) An on-disk representation that is encoded in a space-efficient form (the bitcode files) An on-disk representation in a human-readable text form (the LLVM assembly files) LLVM provides tools and libraries that allow you to manipulate and handle the IR in all forms. Hence, these tools can transform the IR back and forth, from memory to disk as well as apply optimizations, as illustrated in the following diagram: Understanding the LLVM IR target dependency The LLVM IR is designed to be as target-independent as possible, but it still conveys some target-specific aspects. Most people blame the C/C++ language for its inherent, target-dependent nature. To understand this, consider that when you use standard C headers in a Linux system, for instance, your program implicitly imports some header files from the bits Linux headers folder. This folder contains target-dependent header files, including macro definitions that constrain some entities to have a particular type that matches what the syscalls of this kernel-machine expect. Afterwards, when the frontend parses your source code, it needs to also use different sizes for int, for example, depending on the intended target machine where this code will run. Therefore, both library headers and C types are already target-dependent, which makes it challenging to generate an IR that can later be translated to a different target. If you consider only the target-dependent, C standard library headers, the parsed AST for a given compilation unit is already target-dependent, even before the translation to the LLVM IR. Furthermore, the frontend generates IR code using type sizes, calling conventions, and special library calls that match the ones defined by each target ABI. Still, the LLVM IR is quite versatile and is able to cope with distinct targets in an abstract way. Exercising basic tools to manipulate the IR formats We mention that the LLVM IR can be stored on disk in two formats: bitcode and assembly text. We will now learn how to use them. 
Consider the sum.c source code: int sum(int a, int b) { return a+b; } To make Clang generate the bitcode, you can use the following command: $ clang sum.c -emit-llvm -c -o sum.bc To generate the assembly representation, you can use the following command: $ clang sum.c -emit-llvm -S -c -o sum.ll You can also assemble the LLVM IR assembly text, which will create a bitcode: $ llvm-as sum.ll -o sum.bc To convert from bitcode to IR assembly, which is the opposite, you can use the disassembler: $ llvm-dis sum.bc -o sum.ll The llvm-extract tool allows the extraction of IR functions, globals, and also the deletion of globals from the IR module. For instance, extract the sum function from sum.bc with the following command: $ llvm-extract -func=sum sum.bc -o sum-fn.bc Nothing changes between sum.bc and sum-fn.bc in this particular example since sum is already the sole function in this module. Introducing the LLVM IR language syntax Observe the LLVM IR assembly file, sum.ll: target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" target triple = "x86_64-apple-macosx10.7.0" define i32 @sum(i32 %a, i32 %b) #0 { entry: %a.addr = alloca i32, align 4 %b.addr = alloca i32, align 4 store i32 %a, i32* %a.addr, align 4 store i32 %b, i32* %b.addr, align 4 %0 = load i32* %a.addr, align 4 %1 = load i32* %b.addr, align 4 %add = add nsw i32 %0, %1 ret i32 %add } attributes #0 = { nounwind ssp uwtable ... } The contents of an entire LLVM file, either assembly or bitcode, are said to define an LLVM module. The module is the LLVM IR top-level data structure. Each module contains a sequence of functions, which contains a sequence of basic blocks that contain a sequence of instructions. The module also contains peripheral entities to support this model, such as global variables, the target data layout, and external function prototypes as well as data structure declarations. LLVM local values are the analogs of the registers in the assembly language and can have any name that starts with the % symbol. Thus, %add = add nsw i32 %0, %1 will add the local value %0 to %1 and put the result in the new local value, %add. You are free to give any name to the values, but if you are short on creativity, you can just use numbers. In this short example, we can already see how LLVM expresses its fundamental properties: It uses the Static Single Assignment (SSA) form. Note that there is no value that is reassigned; each value has only a single assignment that defines it. Each use of a value can immediately be traced back to the sole instruction responsible for its definition. This has an immense value to simplify optimizations, owing to the trivial use-def chains that the SSA form creates, that is, the list of definitions that reaches a user. If LLVM had not used the SSA form, we would need to run a separate data flow analysis to compute the use-def chains, which are mandatory for classical optimizations such as constant propagation and common subexpression elimination. Code is organized as three-address instructions. Data processing instructions have two source operands and place the result in a distinct destination operand. It has an infinite number of registers. Note how LLVM local values can be any name that starts with the % symbol, including numbers that start at zero, such as %0, %1, and so on, that have no restriction on the maximum number of distinct values. 
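To make the SSA property just listed more concrete, here is a small hand-written illustration (not taken from the book) of how a C-level reassignment maps to distinct LLVM IR values rather than to an overwritten register:

; C fragment:  int x = a + b;  x = x * 2;  return x;
define i32 @twice_sum(i32 %a, i32 %b) {
entry:
  %x1 = add nsw i32 %a, %b       ; first (and only) definition of %x1
  %x2 = mul nsw i32 %x1, 2       ; a new value, not a reassignment of %x1
  ret i32 %x2
}

Each value has exactly one defining instruction, so an optimizer can follow %x2 back to the multiply and then to the add without running any separate data flow analysis.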
The target datalayout construct contains information about endianness and type sizes for target triple that is described in target host. Some optimizations depend on knowing the specific data layout of the target to transform the code correctly. Observe how the layout declaration is done: target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" target triple = "x86_64-apple-macosx10.7.0" We can extract the following facts from this string: The target is an x86_64 processor with macOSX 10.7.0. It is a little-endian target, which is denoted by the first letter in the layout (a lowercase e). Big-endian targets need to use an uppercase E. The information provided about types is in the format type:<size>:<abi>:<preferred>. In the preceding example, p:64:64:64 represents a pointer that is 64 bits wide in size, with the abi and preferred alignments set to the 64-bit boundary. The ABI alignment specifies the minimum required alignment for a type, while the preferred alignment specifies a potentially larger value, if this will be beneficial. The 32-bit integer types i32:32:32 are 32 bits wide in size, 32-bit abi and preferred alignment, and so on. The function declaration closely follows the C syntax: define i32 @sum(i32 %a, i32 %b) #0 { This function returns a value of the type i32 and has two i32 arguments, %a and %b. Local identifiers always need the % prefix, whereas global identifiers use @. LLVM supports a wide range of types, but the most important ones are the following: Arbitrary-sized integers in the iN form; common examples are i32, i64, and i128. Floating-point types, such as the 32-bit single precision float and 64-bit double precision double. Vectors types in the format <<# elements> x <elementtype>>. A vector with four i32 elements is written as <4 x i32>. The #0 tag in the function declaration maps to a set of function attributes, also very similar to the ones used in C/C++ functions and methods. The set of attributes is defined at the end of the file: attributes #0 = { nounwind ssp uwtable "less-precise-fpmad"="false""no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf"="true""no-infs-fp-math"="false" "no-nans-fp-math"="false" "unsafe-fp-math"="false""use-soft-float"="false" } For instance, nounwind marks a function or method as not throwing exceptions, and ssp tells the code generator to use a stack smash protector in an attempt to increase the security of this code against attacks. The function body is explicitly divided into basic blocks (BBs), and a label is used to start a new BB. A label relates to a basic block in the same way that a value identifier relates to an instruction. If a label declaration is omitted, the LLVM assembler automatically generates one using its own naming scheme. A basic block is a sequence of instructions with a single entry point at its first instruction, and a single exit point at its last instruction. In this way, when the code jumps to the label that corresponds to a basic block, we know that it will execute all of the instructions in this basic block until the last instruction, which will change the control flow by jumping to another basic block. 
Basic blocks and their associated labels need to adhere to the following conditions: Each BB needs to end with a terminator instruction, one that jumps to other BBs or returns from the function The first BB, called the entry BB, is special in an LLVM function and must not be the target of any branch instructions Our LLVM file, sum.ll, has only one BB because it has no jumps, loops, or calls. The function start is marked with the entry label, and it ends with the return instruction, ret: entry: %a.addr = alloca i32, align 4 %b.addr = alloca i32, align 4 store i32 %a, i32* %a.addr, align 4 store i32 %b, i32* %b.addr, align 4 %0 = load i32* %a.addr, align 4 %1 = load i32* %b.addr, align 4 %add = add nsw i32 %0, %1 ret i32 %add The alloca instruction reserves space on the stack frame of the current function. The amount of space is determined by element type size, and it respects a specified alignment. The first instruction, %a.addr = alloca i32, align 4, allocates a 4-byte stack element, which respects a 4-byte alignment. A pointer to the stack element is stored in the local identifier, %a.addr. The alloca instruction is commonly used to represent local (automatic) variables. The %a and %b arguments are stored in the stack locations %a.addr and %b.addr by means of store instructions. The values are loaded back from the same memory locations by load instructions, and they are used in the addition, %add = add nsw i32 %0, %1. Finally, the addition result, %add, is returned by the function. The nsw flag specifies that this add operation has "no signed wrap", which indicates instructions that are known to have no overflow, allowing for some optimizations. If you are interested in the history behind the nsw flag, a worthwhile read is the LLVMdev post at http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-November/045730.html by Dan Gohman. In fact, the load and store instructions are redundant, and the function arguments can be used directly in the add instruction. Clang uses -O0 (no optimizations) by default, and the unnecessary loads and stores are not removed. If we compile with -O1 instead, the outcome is a much simpler code, which is reproduced here: define i32 @sum(i32 %a, i32 %b) ... { entry: %add = add nsw i32 %b, %a ret i32 %add } ... Using the LLVM assembly directly is very handy when writing small examples to test target backends and as a means to learn basic LLVM concepts. However, a library is the recommended interface for frontend writers to build the LLVM IR, which is the subject of our next section. You can find a complete reference to the LLVM IR assembly syntax at http://llvm.org/docs/LangRef.html. Introducing the LLVM IR in-memory model The in-memory representation closely models the LLVM language syntax that we just presented. The header files for the C++ classes that represent the IR are located at include/llvm/IR. The following is a list of the most important classes: The Module class aggregates all of the data used in the entire translation unit, which is a synonym for "module" in LLVM terminology. It declares the Module::iterator typedef as an easy way to iterate across the functions inside this module. You can obtain these iterators via the begin() and end() methods. View its full interface at http://llvm.org/docs/doxygen/html/classllvm_1_1Module.html. The Function class contains all objects related to a function definition or declaration. In the case of a declaration (use the isDeclaration() method to check whether it is a declaration), it contains only the function prototype. 
In both cases, it contains a list of the function parameters accessible via the getArgumentList() method or the pair of arg_begin() and arg_end(). You can iterate through them using the Function::arg_iterator typedef. If your Function object represents a function definition, and you iterate through its contents via the for (Function::iterator i = function.begin(), e = function.end(); i != e; ++i) idiom, you will iterate across its basic blocks. View its full interface at http://llvm.org/docs/doxygen/html/classllvm_1_1Function.html. The BasicBlock class encapsulates a sequence of LLVM instructions, accessible via the begin()/end() idiom. You can directly access its last instruction using the getTerminator() method, and you also have a few helper methods to navigate the CFG, such as accessing predecessor basic blocks via getSinglePredecessor(), when the basic block has a single predecessor. However, if it does not have a single predecessor, you need to work out the list of predecessors yourself, which is also not difficult if you iterate through basic blocks and check the target of their terminator instructions. View its full interface at http://llvm.org/docs/doxygen/html/classllvm_1_1BasicBlock.html. The Instruction class represents an atom of computation in the LLVM IR, a single instruction. It has some methods to access high-level predicates, such as isAssociative(), isCommutative(), isIdempotent(), or isTerminator(), but its exact functionality can be retrieved with getOpcode(), which returns a member of the llvm::Instruction enumeration, which represents the LLVM IR opcodes. You can access its operands via the op_begin() and op_end() pair of methods, which are inherited from the User superclass that we will present shortly. View its full interface at http://llvm.org/docs/doxygen/html/classllvm_1_1Instruction.html. We have still not presented the most powerful aspect of the LLVM IR (enabled by the SSA form): the Value and User interfaces; these allow you to easily navigate the use-def and def-use chains. In the LLVM in-memory IR, a class that inherits from Value means that it defines a result that can be used by others, whereas a subclass of User means that this entity uses one or more Value interfaces. Function and Instruction are subclasses of both Value and User, while BasicBlock is a subclass of just Value. To understand this, let's analyze these two classes in depth: The Value class defines the use_begin() and use_end() methods to allow you to iterate through Users, offering an easy way to access its def-use chain. For every Value class, you can also access its name through the getName() method. This models the fact that any LLVM value can have a distinct identifier associated with it. For example, %add1 can identify the result of an add instruction, BB1 can identify a basic block, and myfunc can identify a function. Value also has a powerful method called replaceAllUsesWith(Value *), which navigates through all of the users of this value and replaces it with some other value. This is a good example of how the SSA form allows you to easily substitute instructions and write fast optimizations. You can view the full interface at http://llvm.org/docs/doxygen/html/classllvm_1_1Value.html. The User class has the op_begin() and op_end() methods that allows you to quickly access all of the Value interfaces that it uses. Note that this represents the use-def chain. You can also use a helper method called replaceUsesOfWith(Value *From, Value *To) to replace any of its used values. 
You can view the full interface at http://llvm.org/docs/doxygen/html/classllvm_1_1User.html. Summary In this article, we acquainted ourselves with the concepts and components related to the LLVM intermediate representation. Resources for Article: Further resources on this subject: Creating and Utilizing Custom Entities [Article] Getting Started with Code::Blocks [Article] Program structure, execution flow, and runtime objects [Article]
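To round off this article, the following is a minimal C++ sketch, not taken from the book's code, that uses the classes described above together with the IRBuilder convenience class (not covered in the list) to build the sum function programmatically and print its textual IR. Header locations and small API details (for example, how function arguments are dereferenced from arg_iterator) vary between LLVM releases, so treat it as an outline to adapt to the version you have installed rather than a drop-in source file.

// build_sum.cpp - sketch of the LLVM C++ IR-building API (version-dependent details hedged above)
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/Support/raw_ostream.h"
#include <vector>

int main() {
    llvm::LLVMContext Context;
    llvm::Module Mod("sum_module", Context);

    // i32 sum(i32, i32)
    llvm::Type *Int32Ty = llvm::Type::getInt32Ty(Context);
    std::vector<llvm::Type*> Params(2, Int32Ty);
    llvm::FunctionType *FnTy = llvm::FunctionType::get(Int32Ty, Params, /*isVarArg=*/false);
    llvm::Function *SumFn = llvm::Function::Create(
        FnTy, llvm::Function::ExternalLinkage, "sum", &Mod);

    // A single "entry" basic block holding the add and the return.
    llvm::BasicBlock *Entry = llvm::BasicBlock::Create(Context, "entry", SumFn);
    llvm::IRBuilder<> Builder(Entry);

    llvm::Function::arg_iterator Args = SumFn->arg_begin();
    llvm::Value *A = &*Args++;
    llvm::Value *B = &*Args;
    A->setName("a");
    B->setName("b");
    llvm::Value *Add = Builder.CreateNSWAdd(A, B, "add"); // %add = add nsw i32 %a, %b
    Builder.CreateRet(Add);

    Mod.print(llvm::outs(), nullptr); // dump the textual IR of the module
    return 0;
}

A build command typically looks something like clang++ build_sum.cpp $(llvm-config --cxxflags --ldflags --libs core) -o build_sum, although the exact set of flags and libraries depends on your LLVM installation.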

Using OSGi Services

Packt
26 Aug 2014
14 min read
This article created by Dr Alex Blewitt the author of Mastering Eclipse Plug-in Development will present OSGi services as a means to communicate with and connect applications. Unlike the Eclipse extension point mechanism, OSGi services can have multiple versions available at runtime and can work in other OSGi environments, such as Felix or other commercial OSGi runtimes. (For more resources related to this topic, see here.) Overview of services In an Eclipse or OSGi runtime, each individual bundle is its own separate module, which has explicit dependencies on library code via Import-Package, Require-Bundle, or Require-Capability. These express static relationships and provide a way of configuring the bundle's classpath. However, this presents a problem. If services are independent, how can they use contributions provided by other bundles? In Eclipse's case, the extension registry provides a means for code to look up providers. In a standalone OSGi environment, OSGi services provide a similar mechanism. A service is an instance of a class that implements a service interface. When a service is created, it is registered with the services framework under one (or more) interfaces, along with a set of properties. Consumers can then get the service by asking the framework for implementers of that specific interface. Services can also be registered under an abstract class, but this is not recommended. Providing a service interface exposed as an abstract class can lead to unnecessary coupling of client to implementation. The following diagram gives an overview of services: This separation allows the consumer and producer to depend on a common API bundle, but otherwise be completely decoupled from one another. This allows both the consumer and producer to be mocked out or exchange with different implementations in the future. Registering a service programmatically To register a service, an instance of the implementation class needs to be created and registered with the framework. Interactions with the framework are performed with an instance of BundleContext—typically provided in the BundleActivator.start method and stored for later use. The *FeedParser classes will be extended to support registration as a service instead of the Equinox extension registry. Creating an activator A bundle's activator is a class that is instantiated and coupled to the lifetime of the bundle. When a bundle is started, if a manifest entry Bundle-Activator exists, then the corresponding class is instantiated. As long as it implements the BundleActivator interface, the start method will be called. This method is passed as an instance of BundleContext, which is the bundle's connection to the hosting OSGi framework. Create a class in the com.packtpub.e4.advanced.feeds project called com.packtpub.e4.advanced.feeds.internal.FeedsActivator, which implements the org.osgi.framework.BundleActivator interface. The quick fix may suggest adding org.osgi.framework as an imported package. Accept this, and modify the META-INF/MANIFEST.MF file as follows: Import-Package: org.osgi.framework Bundle-Activator: com.packtpub.e4.advanced.feeds.internal.FeedsActivator The framework will automatically invoke the start method of the FeedsActivator when the bundle is started, and correspondingly, the stop method when the bundle is stopped. 
Test this by inserting a pair of println calls: public class FeedsActivator implements BundleActivator { public void start(BundleContext context) throws Exception { System.out.println("Bundle started"); } public void stop(BundleContext context) throws Exception { System.out.println("Bundle stopped"); } } Now run the project as an OSGi framework with the feeds bundle, the Equinox console, and the Gogo shell. The required dependencies can be added by clicking on Add Required Bundles, although the Include optional dependencies checkbox does not need to be selected. Ensure that the other workspace and target bundles are deselected with the Deselect all button, as shown in the following screenshot: The required bundles are as follows: com.packtpub.e4.advanced.feeds org.apache.felix.gogo.command org.apache.felix.gogo.runtime org.apache.felix.gogo.shell org.eclipse.equinox.console org.eclipse.osgi On the console, when the bundle is started (which happens automatically if the Default Auto-Start is set to true), the Bundle started message should be seen. If the bundle does not start, ss in the console will print a list of bundles and start 2 will start the bundle with the ID 2. Afterwards, stop 2 can be used to stop bundle 2. Bundles can be stopped/started dynamically in an OSGi framework. Registering the service Once the FeedsActivator instance is created, a BundleContext instance will be available for interaction with the framework. This can be persisted for subsequent use in an instance field and can also be used directly to register a service. The BundleContext class provides a registerService method, which takes an interface, an instance, and an optional Dictionary instance of key/value pairs. This can be used to register instances of the feed parser at runtime. Modify the start method as follows: public void start(BundleContext context) throws Exception { context.registerService(IFeedParser.class, new RSSFeedParser(), null); context.registerService(IFeedParser.class, new AtomFeedParser(), null); context.registerService(IFeedParser.class, new MockFeedParser(), null); } Now start the framework again. In the console that is launched, look for the bundle corresponding to the feeds bundle: osgi> bundles | grep feeds com.packtpub.e4.advanced.feeds_1.0.0.qualifier [4] {com.packtpub.e4.advanced.feeds.IFeedParser}={service.id=56} {com.packtpub.e4.advanced.feeds.IFeedParser}={service.id=57} {com.packtpub.e4.advanced.feeds.IFeedParser}={service.id=58} This shows that bundle 4 has started three services, using the interface com.packtpub.e4.advanced.feeds.IFeedParser, and with service IDs 56, 57, and 58. It is also possible to query the runtime framework for services of a known interface type directly using the services command and an LDAP style filter: osgi> services (objectClass=com.packtpub.e4.advanced.feeds.IFeedParser) {com.packtpub.e4.advanced.feeds.IFeedParser}={service.id=56} "Registered by bundle:" com.packtpub.e4.advanced.feeds_1.0.0.qualifier [4] "No bundles using service." {com.packtpub.e4.advanced.feeds.IFeedParser}={service.id=57} "Registered by bundle:" com.packtpub.e4.advanced.feeds_1.0.0.qualifier [4] "No bundles using service." {com.packtpub.e4.advanced.feeds.IFeedParser}={service.id=58} "Registered by bundle:" com.packtpub.e4.advanced.feeds_1.0.0.qualifier [4] "No bundles using service." The results displayed represent the three services instantiated. 
They can be introspected using the service command passing the service.id: osgi> service 56 com.packtpub.e4.advanced.feeds.internal.RSSFeedParser@52ba638e osgi> service 57 com.packtpub.e4.advanced.feeds.internal.AtomFeedParser@3e64c3a osgi> service 58 com.packtpub.e4.advanced.feeds.internal.MockFeedParser@49d5e6da Priority of services Services have an implicit order, based on the order in which they were instantiated. Each time a service is registered, a global service.id is incremented. It is possible to define an explicit service ranking with an integer property. This is used to ensure relative priority between services, regardless of the order in which they are registered. For services with equal service.ranking values, the service.id values are compared. OSGi R6 adds an additional property, service.bundleid, which is used to denote the ID of the bundle that provides the service. This is not used to order services, and is for informational purposes only. Eclipse Luna uses OSGi R6. To pass a priority into the service registration, create a helper method called priority, which takes an int value and stores it in a Hashtable with the key service.ranking. This can be used to pass a priority to the service registration methods. The following code illustrates this: private Dictionary<String,Object> priority(int priority) { Hashtable<String, Object> dict = new Hashtable<String,Object>(); dict.put("service.ranking", new Integer(priority)); return dict; } public void start(BundleContext context) throws Exception { context.registerService(IFeedParser.class, new RSSFeedParser(), priority(1)); context.registerService(IFeedParser.class, new MockFeedParser(), priority(-1)); context.registerService(IFeedParser.class, new AtomFeedParser(), priority(2)); } Now when the framework starts, the services are displayed in order of priority: osgi> services | (objectClass=com.packtpub.e4.advanced.feeds.IFeedParser) {com.packtpub.e4.advanced.feeds.IFeedParser}={service.ranking=2, service.id=58} "Registered by bundle:" com.packtpub.e4.advanced.feeds_1.0.0.qualifier [4] "No bundles using service." {com.packtpub.e4.advanced.feeds.IFeedParser}={service.ranking=1, service.id=56} "Registered by bundle:" com.packtpub.e4.advanced.feeds_1.0.0.qualifier [4] "No bundles using service." {com.packtpub.e4.advanced.feeds.IFeedParser}={service.ranking=-1, service.id=57} "Registered by bundle:" com.packtpub.e4.advanced.feeds_1.0.0.qualifier [4] "No bundles using service." Dictionary was the original Java Map interface, and Hashtable the original HashMap implementation. They fell out of favor in Java 1.2 when Map and HashMap were introduced (mainly because they weren't synchronized by default) but OSGi was developed to run on early releases of Java (JSR 8 proposed adding OSGi as a standard for the Java platform). Not only that, early low-powered Java mobile devices didn't support the full Java platform, instead exposing the original Java 1.1 data structures. Because of this history, many APIs in OSGi refer to only Java 1.1 data structures so that low-powered devices can still run OSGi systems. Using the services The BundleContext instance can be used to acquire services as well as register them. FeedParserFactory, which originally used the extension registry, can be upgraded to refer to services instead. To obtain an instance of BundleContext, store it in the FeedsActivator.start method as a static variable. That way, classes elsewhere in the bundle will be able to acquire the context. 
An accessor method provides an easy way to do this: public class FeedsActivator implements BundleActivator { private static BundleContext bundleContext; public static BundleContext getContext() { return bundleContext; } public void start(BundleContext context) throws Exception { // register methods as before bundleContext = context; } public void stop(BundleContext context) throws Exception { bundleContext = null; } } Now the FeedParserFactory class can be updated to acquire the services. OSGi services are represented via a ServiceReference instance (which is a sharable object representing a handle to the service) and can be used to acquire a service instance: public class FeedParserFactory { public List<IFeedParser> getFeedParsers() { List<IFeedParser> parsers = new ArrayList<IFeedParser>(); BundleContext context = FeedsActivator.getContext(); try { Collection<ServiceReference<IFeedParser>> references = context.getServiceReferences(IFeedParser.class, null); for (ServiceReference<IFeedParser> reference : references) { parsers.add(context.getService(reference)); context.ungetService(reference); } } catch (InvalidSyntaxException e) { // ignore } return parsers; } } In this case, the service references are obtained from the bundle context with a call to context.getServiceReferences(IFeedParser.class,null). The service references can be used to access the service's properties, and to acquire the service. The service instance is acquired with the context.getService(ServiceReference) call. The contract is that the caller "borrows" the service, and when finished, should return it with an ungetService(ServiceReference) call. Technically, the service is only supposed to be used between the getService and ungetService calls as its lifetime may be invalid afterwards; instead of returning an array of service references, the common pattern is to pass in a unit of work that accepts the service and then call ungetService afterwards. However, to fit in with the existing API, the service is acquired, added to the list, and then released immediately afterwards. Lazy activation of bundles Now run the project as an Eclipse application, with the feeds and feeds.ui bundles installed. When a new feed is created by navigating to File | New | Other | Feeds | Feed, and a feed such as http://alblue.bandlem.com/atom.xml is entered, the feeds will be shown in the navigator view. When drilling down, a NullPointerException may be seen in the logs, as shown in the following: !MESSAGE An exception occurred invoking extension: com.packtpub.e4.advanced.feeds.ui.feedNavigatorContent for object com.packtpub.e4.advanced.feeds.Feed@770def59 !STACK 0 java.lang.NullPointerException at com.packtpub.e4.advanced.feeds.FeedParserFactory. getFeedParsers(FeedParserFactory.java:31) at com.packtpub.e4.advanced.feeds.ui.FeedContentProvider. getChildren(FeedContentProvider.java:80) at org.eclipse.ui.internal.navigator.extensions. SafeDelegateTreeContentProvider. getChildren(SafeDelegateTreeContentProvider.java:96) Tracing through the code indicates that the bundleContext is null, which implies that the feeds bundle has not yet been started. This can be seen in the console of the running Eclipse application by executing the following code: osgi> ss | grep feeds 866 ACTIVE com.packtpub.e4.advanced.feeds.ui_1.0.0.qualifier 992 RESOLVED com.packtpub.e4.advanced.feeds_1.0.0.qualifier While the feeds.ui bundle is active, the feeds bundle is not. Therefore, the services haven't been instantiated, and bundleContext has not been cached. 
By default, bundles are not started when they are accessed for the first time. If the bundle needs its activator to be called prior to using any of the classes in the package, it needs to be marked as having an activation policy of lazy. This is done by adding the following entry to the MANIFEST.MF file: Bundle-ActivationPolicy: lazy The manifest editor can be used to add this configuration line by selecting Activate this plug-in when one of its classes is loaded, as shown in the following screenshot: Now, when the application is run, the feeds will resolve appropriately. Comparison of services and extension points Both mechanisms (using the extension registry and using the services) allow for a list of feed parsers to be contributed and used by the application. What are the differences between them, and are there any advantages to one or the other? Both the registry and services approaches can be used outside of an Eclipse runtime. They work the same way when used in other OSGi implementations (such as Felix) and can be used interchangeably. The registry approach can also be used outside of OSGi, although that is far less common. The registry encodes its information in the plugin.xml file by default, which means that it is typically edited as part of a bundle's install (it is possible to create registry entries from alternative implementations if desired, but this rarely happens). The registry has a notification system, which can listen to contributions being added and removed. The services approach uses the OSGi framework to store and maintain a list of services. These services don't have an explicit configuration file and, in fact, can be contributed by code (such as the registerService calls) or by declarative representations. The separation of how the service is created versus how the service is registered is a key difference between the service and the registry approach. Like the registry, the OSGi services system can generate notifications when services come and go. One key difference in an OSGi runtime is that bundles depending on the Eclipse registry must be declared as singletons; that is, they have to use the ;singleton:=true directive on Bundle-SymbolicName. This means that there can only be one version of a bundle that exposes registry entries in a runtime, as opposed to multiple versions in the case of general services. While the registry does provide mechanisms to be able to instantiate extensions from factories, these typically involve simple configurations and/or properties that are hard-coded in the plugin.xml files themselves. They would not be appropriate to store sensitive details such as passwords. On the other hand, a service can be instantiated from whatever external configuration information is necessary and then registered, such as a JDBC connection for a database. Finally, extensions in the registry are declarative by default and are activated on demand. This allows Eclipse to start quickly because it does not need to build the full set of class loader objects or run code, and then bring up services on demand. Although the approach previously didn't use declarative services, it is possible to do this. Summary This article introduced OSGi services as a means to extend an application's functionality. It also shed light on how to register a service programmatically. 
Resources for Article: Further resources on this subject: Apache Maven and m2eclipse [article] Introducing an Android platform [article] Installing and Setting up JavaFX for NetBeans and Eclipse IDE [article]

The NServiceBus Architecture

Packt
25 Aug 2014
11 min read
In this article by Rich Helton, the author of Mastering NServiceBus and Persistence, we will focus on the NServiceBus architecture. We will discuss the different message and storage types supported in NSB. This discussion will include an introduction to some of the tools and advantages of using NSB. We will conceptually look at how some of the pieces fit together while backing up the discussions with code examples. (For more resources related to this topic, see here.) NSB is the cornerstone of automation. As an Enterprise Service Bus (ESB), NSB is the most popular C# ESB solution. NSB is a framework that is used to provide many of the benefits of implementing a service-oriented architecture (SOA). It uses an IBus and its ESB bus to handle messages between NSB services, without having to create custom interaction. This type of messaging between endpoints creates the bus. The services, which are autonomous Windows processes, use both Windows and NSB hosting services. NSB-hosting services provide extra functionalities, such as creating endpoints; setting up Microsoft Queuing (MSMQ), DTC for transactions across queues, subscription storage for publish/subscribe message information, NSB sagas; and much more. Deploying these pieces for messaging manually can lead to errors and a lot of work is involved to get it correct. NSB takes care of provisioning its needed pieces. NSB is not a frontend framework, such as Microsoft's Model-View-Controller (MVC). It is not used as an Object-to-Relationship Mapper (ORM), such as Microsoft's Entity Frameworks, to map objects to SQL Server tables. It is also not a web service framework, such as Microsoft's Windows Communication Foundation (WCF). NSB is a framework to provide the communication and support for services to communicate with each other and provide an end-to-end workflow to process all of these pieces. Benefits of NSB NSB provides many components needed for automation that are only found in ESBs. ESBs provide the following: Separation of duties: From the frontend to the backend by allowing the frontend to fire a message to a service and continue with its processing not worrying about the results until it needs an update. Also, you can separate workflow responsibilities by separating NSB services. One service could be used to send payments to a bank, and another service can be used to provide feedback of the current status of the payment to the MVC-EF database so that a user may see the status of their payment. Message durability: Messages are saved in queues between services so that if the services are stopped, they can start from the messages saved in the queues when they are restarted. This is done so that the messages will persist, until told otherwise. Workflow retries: Messages, or endpoints, can be told to retry a number of times until they completely fail and send an error. The error is automated to return to an error queue. For instance, a web service message can be sent to a bank, and it can be set to retry the web service every 5 minutes for 20 minutes before giving up completely. This is useful while fixing any network or server issues. Monitoring: NSB's ServicePulse can keep a check on the heartbeat of its services. Other monitoring checks can be easily performed on NSB queues to report the number of messages. Encryption: Messages between services and endpoints can be easily encrypted. 
High availability: Multiple services, or subscribers, could be processing the same or similar messages from various services that live on different servers. When one server, or a service, goes down, others could be made available to take over that are already running. More on endpoints While working with a service-to-service interaction, messages are transmitted in the form of XML through queues that are normally part of Microsoft Server such as MSMQ, SQL Server such as SQL queuing, or even part of Microsoft Azure queues for cloud computing. There are other endpoints that services use to process resources that are not part of service-to-service communications. These endpoints are used to process commands and messages as well, for instance, sending a file to non-NSB-hosted services, sending SFTP files to non-NSB-hosted services, or sending web services, such as payments, to non-NSB services. While at the other end of these communications are non-NSB-hosted services, NSB offers a lot of integrity by checking how these endpoints were processed. NSB provides information on whether a web service was processed or not, with or without errors, and provides feedback and monitoring, and maintains the records through queues. It also provides saga patterns to provide feedback to the originating NSB services of the outcome while storing messages from a particular NSB service to the NSB service of everything that has happened. In many NSB services, an audit queue is used to keep a backup of each message that occurred successfully, and the error queue is used to keep track of any message that was not processed successfully. The application security perspective From the application security perspective, OWASP's top ten list of concerns, available at https://www.owasp.org/index.php/Top_10_2013-Top_10, seems to always surround injection, such as SQL injection, broken authentication, and cross-site scripting (XSS). Once an organization puts a product in production, they usually have policies in place for the company's security personnel to scan the product at will. Not all organizations have these policies in place, but once an organization attaches their product to the Internet, there are armies of hackers that may try various methods to attack the site, depending on whether there is money to be gained or not. Money comes in a new economy these days in the form of using a site as a proxy to stage other attacks, or to grab usernames and passwords that a user may have for a different system in order to acquire a user's identity or financial information. Many companies have suffered bankruptcy over the last decades thinking that they were secure. NSB offers processing pieces to the backend that would normally be behind a firewall to provide some protection. Firewalls provide some protection as well as Intrusion Detection Systems (IDSes), but there is so much white noise for viruses and scans that many real hack attacks may go unnoticed, except by very skilled antihackers. NSB offers additional layers of security by using queuing and messaging. The messages can be encrypted, and the queues may be set for limited authorization from production administrators. NSB hosting versus self-hosting NServiceBus.Host is an executable that will deploy the NSB service. When the NSB service is compiled, it turns into a Windows DLL that may contain all the configuration settings for the IBus. 
If additional settings are needed for the endpoint's configuration that are not coded in the IBus configuration, they can be supplied on the Host command line. However, NServiceBus.Host need not be used to create the program that is used in NServiceBus. As a developer, you can create a console program that is run by the Windows Task Scheduler, or even create your own services that run the NSB IBus code as an endpoint. Not using the NSB-hosting engine is normally referred to as self-hosting. The NServiceBus host streamlines service development and deployment, allows you to change technologies without code changes, and is administrator friendly when setting permissions and accounts. It will deploy your application as an NSB-hosted solution. It can also add configurations to your program at the NServiceBus.Host.exe command line. If you develop a program with the NServiceBus.Host reference, you can use EndpointConfig.cs to define your IBus configuration in code, or add it as part of the command line, instead of creating your own Program.cs that will do a lot of the same work with more code. When debugging with the NServiceBus.Host reference, the Visual Studio project creates a Windows DLL that is run by the NServiceBus.Host.exe command. Here is an example of the properties of a Visual Studio project:

The NServiceBus.Host.exe command line has support for deploying Windows services as NSB-hosted services:

These configurations are typically referred to as the profile for which the service will be running. Here are some of the common profiles:

- MultiSite: This turns on the gateway.
- Master: This makes the endpoint a "master node endpoint". This means that it runs the gateway for multisite interaction, the timeout manager, and the distributor. It also starts a worker that is enlisted with the distributor. It cannot be combined with the Worker or Distributor profiles.
- Worker: This makes the current endpoint enlist as a worker with its distributor running on the master node. It cannot be combined with the Master or Distributor profiles.
- Distributor: This starts the endpoint only as a distributor. This means that the endpoint does no actual work and only distributes the load among its enlisted workers. It cannot be combined with the Master and Worker profiles.
- Performance counters: This turns on the NServiceBus-specific performance counters. Performance counters are installed by default when you run a Production profile.
- Lite: This keeps everything in memory with the most detailed logging.
- Integration: This uses technologies closer to production but without a scale-out option and less logging. It is used in testing.
- Production: This uses scale-out-friendly technologies and minimal file-based logging. It is used in production.

Using PowerShell commands

Many items can be managed in the Package Manager Console of Visual Studio 2012. Just as we add commands to the NServiceBus.Host.exe command line to extend profiles and configurations, we may also use the VS2012 Package Manager to extend some of the functionality while debugging and testing. We will use the ScaleOut solution discussed later just to double-check that the performance counters are installed correctly. We need to make sure that the PowerShell commandlets are installed correctly first.
We do this by using the Package Manager Console:

1. Install the package: Install-Package NServiceBus.PowerShell
2. Import the module: Import-Module .\packages\NServiceBus.PowerShell.4.3.0\lib\net40\NServiceBus.PowerShell.dll
3. Test the installation: Test-NServiceBusPerformanceCountersInstallation

The "Import module" step depends on where NServiceBus.PowerShell.dll was installed during the "Install package" step. The "Install-Package" command will add the DLL into a package directory related to the solution. We can find out more on the PowerShell commandlets at http://docs.particular.net/nservicebus/managing-nservicebus-using-powershell and even by reviewing the help section of Package Manager. Here, we see that we can insert configurations into App.config when we look at the help section, PM> get-help about_NServiceBus.

Message exchange patterns

Let's discuss the various exchange patterns now.

The publish/subscribe pattern

One of the biggest benefits of using ESB technology is the publish/subscribe message pattern; refer to http://en.wikipedia.org/wiki/Publish-subscribe_pattern. The publish/subscribe pattern has a publisher that sends messages to a queue, say an MSMQ MyPublisher queue. Subscribers, say Subscriber1 and Subscriber2, will listen for messages on the queues that they are defined to take messages from. If MyPublisher cannot process the messages, it will return them to the queue or to an error queue, based on the reasons why it could not process the message. The queues that the subscribers watch for messages are defined by endpoint mappings. The publisher endpoint mapping is usually based on the project's name by default. This concept is the cornerstone of understanding NSB and ESBs. No messages will be removed unless a service explicitly tells them to be removed. Therefore, no messages will be lost, and all are accounted for by the services. The configuration data is saved to the database. Also, the subscribers can respond back to MyPublisher through the queue with messages indicating whether everything was alright or not. So why is this important? It's because all the messages can then be accounted for, and feedback can be provided to all the services. A service is a Windows service that is created and hosted by the NSB host program. It could also be a Windows command console program or even an MVC program, but the service program is always up and running on the server, continuously checking queues and messages that are sent to it from other endpoints. These messages could be commands, such as instructions to go and look at the remote server to see whether it is still running, or data messages such as sending a particular payment to the bank through a web service. For NSB, we formalize that events are used in publish/subscribe, and commands are used in a request-response message exchange pattern. A Windows server could host many such services, so some of these services could just be standing by, waiting to take over if one service stops responding, or processing messages simultaneously. This provides very high availability. A toy sketch of the publish/subscribe flow appears below.
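NServiceBus itself is a C# framework, so the sketch below is deliberately not NSB code; it is a small, self-contained C++ toy (kept in the same language as the other code examples added to this page) that only illustrates the flow just described: a publisher pushes a message onto a queue and every registered subscriber receives it. Durable queues, retries, and the error queue are exactly what NSB layers on top of this basic idea.

#include <functional>
#include <iostream>
#include <queue>
#include <string>
#include <vector>

// Toy illustration of publish/subscribe; names such as ToyBus are made up for this sketch.
class ToyBus {
public:
    using Handler = std::function<void(const std::string&)>;

    void Subscribe(Handler handler) { subscribers_.push_back(handler); }

    void Publish(const std::string& message) {
        queue_.push(message);   // the message is stored first (durability in a real ESB)
        Dispatch();             // then delivered to every subscriber
    }

private:
    void Dispatch() {
        while (!queue_.empty()) {
            const std::string msg = queue_.front();
            queue_.pop();
            for (auto& s : subscribers_) s(msg);   // each subscriber gets its own copy
        }
    }

    std::queue<std::string> queue_;
    std::vector<Handler> subscribers_;
};

int main() {
    ToyBus bus;
    bus.Subscribe([](const std::string& m) { std::cout << "Subscriber1: " << m << "\n"; });
    bus.Subscribe([](const std::string& m) { std::cout << "Subscriber2: " << m << "\n"; });
    bus.Publish("PaymentAccepted");   // both subscribers see the event
    return 0;
}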

New functionality in OpenCV 3.0

Packt
25 Aug 2014
5 min read
In this article by Oscar Deniz Suarez, coauthor of the book OpenCV Essentials, we will cover the forthcoming Version 3.0, which represents a major evolution of the OpenCV library for Computer Vision. Currently, OpenCV already includes several new techniques that are not available in the latest official release (2.4.9). The new functionality can already be used by downloading and compiling the latest development version from the official repository. This article provides an overview of some of the new techniques implemented. Other numerous lower-level changes in the forthcoming Version 3.0 (updated module structure, C++ API changes, transparent API for GPU acceleration, and so on) are not discussed. (For more resources related to this topic, see here.)

Line Segment Detector

OpenCV users have had the Hough transform-based straight line detector available in the previous versions. An improved method called Line Segment Detector (LSD) is now available. LSD is based on the algorithm described at http://dx.doi.org/10.5201/ipol.2012.gjmr-lsd. This method has been shown to be more robust and faster than the best previous Hough-based detector (the Progressive Probabilistic Hough Transform). The detector is now part of the imgproc module. OpenCV provides a short sample code ([opencv_source_code]/samples/cpp/lsd_lines.cpp), which shows how to use the LineSegmentDetector class. The main methods of the class are the following:

- <constructor>: The constructor allows you to enter the parameters of the algorithm; in particular, the level of refinement we want in the result.
- detect: This method detects line segments in the image.
- drawSegments: This method draws the segments in a given image.
- compareSegments: This method draws two sets of segments in a given image. The two sets are drawn with blue and red color lines.

Connected components

The previous versions of OpenCV have included functions for working with image contours. Contours are the external limits of connected components (that is, regions of connected pixels in a binary image). The new functions, connectedComponents and connectedComponentsWithStats, retrieve connected components as such. The connected components are retrieved as a labeled image with the same dimensions as the input image. This allows drawing the components on the original image easily. The connectedComponentsWithStats function retrieves useful statistics about each component (a short usage example appears at the end of this article):

- CC_STAT_LEFT: The leftmost (x) coordinate, which is the inclusive start of the bounding box in the horizontal direction.
- CC_STAT_TOP: The topmost (y) coordinate, which is the inclusive start of the bounding box in the vertical direction.
- CC_STAT_WIDTH: The horizontal size of the bounding box.
- CC_STAT_HEIGHT: The vertical size of the bounding box.
- CC_STAT_AREA: The total area (in pixels) of the connected component.

Scene text detection

Text recognition is a classic problem in Computer Vision. Thus, Optical Character Recognition (OCR) is now routinely used in our society. In OCR, the input image is expected to depict typewriter-style black text over a white background. In recent years, researchers have aimed at the more challenging problem of recognizing text "in the wild": on street signs and indoor signs, with diverse backgrounds, fonts, colors, and so on. The following figure shows an example of the difference between the two scenarios. In such a scenario, OCR cannot be applied directly to the input images. Consequently, text recognition is actually accomplished in two steps: the text is first localized in the image, and then character or word recognition is performed on the cropped region. OpenCV now provides a scene text detector based on the algorithm described in Neumann L., Matas J.: Real-Time Scene Text Localization and Recognition, CVPR 2012 (Providence, Rhode Island, USA). The OpenCV implementation makes use of additional improvements found at http://158.109.8.37/files/GoK2013.pdf. OpenCV includes an example ([opencv_source_code]/samples/cpp/textdetection.cpp) that detects and draws text regions in an input image.

The KAZE and AKAZE features

Several 2D features have been proposed in the computer vision literature. Generally, the two most important aspects of feature extraction algorithms are computational efficiency and robustness. One of the latest contenders is the pair formed by the KAZE (a Japanese word meaning "Wind") and Accelerated-KAZE (AKAZE) detectors. There are reports that show that KAZE features are both robust and efficient as compared with other widely known features (BRISK, FREAK, and so on). The underlying algorithm is described in KAZE Features, Pablo F. Alcantarilla, Adrien Bartoli, and Andrew J. Davison, in European Conference on Computer Vision (ECCV), Florence, Italy, October 2012. As with other keypoint detectors in OpenCV, the KAZE implementation allows retrieving both keypoints and descriptors (that is, a feature vector computed around the keypoint neighborhood). The detector follows the same framework used in OpenCV for other detectors, so drawing methods are also available.

Computational photography

One of the modules with the most improvements in the forthcoming Version 3.0 is the computational photography module (photo). The new techniques include the following functionalities:

- HDR imaging: Functions for handling High-Dynamic Range images (tonemapping, exposure alignment, camera calibration with multiple exposures, and exposure fusion).
- Seamless cloning: Functions for realistically inserting one image into another image with an arbitrarily shaped region of interest.
- Non-photorealistic rendering: This includes non-photorealistic filters (such as a pencil-like drawing effect) and edge-preserving smoothing filters (similar to the bilateral filter).

New modules

Finally, we provide a list of the new modules in development for Version 3.0:

- videostab: Global motion estimation, Fast Marching method.
- softcascade: Implements a stageless variant of the cascade detector, which is considered more accurate.
- shape: Shape matching and retrieval. Shape context descriptor and matching algorithm, Hausdorff distance, and Thin-Plate Splines.
- cuda<X>: Several modules with CUDA-accelerated implementations of other functions in the library.

Summary

In this article, we learned about the different functionalities in OpenCV 3.0 and its different components.

Resources for Article:

Further resources on this subject: Wrapping OpenCV [article] A quick start – OpenCV fundamentals [article] Linking OpenCV to an iOS project [article]
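As promised in the connected components section, here is a short, self-contained C++ sketch that labels the connected components of a binarized image and prints per-component statistics using connectedComponentsWithStats. It assumes an OpenCV 3.0 development build; the input file name is only a placeholder.

#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // Load an image and binarize it; "input.png" is a placeholder path.
    cv::Mat img = cv::imread("input.png", cv::IMREAD_GRAYSCALE);
    if (img.empty()) { std::cerr << "Cannot read input.png\n"; return 1; }

    cv::Mat bw;
    cv::threshold(img, bw, 128, 255, cv::THRESH_BINARY);

    // Retrieve connected components together with their statistics.
    cv::Mat labels, stats, centroids;
    int n = cv::connectedComponentsWithStats(bw, labels, stats, centroids);

    // Label 0 is the background; print the area and bounding box size of the rest.
    for (int i = 1; i < n; ++i) {
        std::cout << "component " << i
                  << " area=" << stats.at<int>(i, cv::CC_STAT_AREA)
                  << " bbox=" << stats.at<int>(i, cv::CC_STAT_WIDTH)
                  << "x" << stats.at<int>(i, cv::CC_STAT_HEIGHT) << "\n";
    }
    return 0;
}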

BPMS Components

Packt
19 Aug 2014
8 min read
In this article by Mariano Nicolas De Maio, the author of jBPM6 Developer Guide, we will look into the various components of a Business Process Management (BPM) system. (For more resources related to this topic, see here.) BPM systems are pieces of software created with the sole purpose of guiding your processes through the BPM cycle. They were originally monolithic systems in charge of every aspect of a process, where they had to be heavily migrated from visual representations to executable definitions. They've come a long way from there, but we usually relate them to the same old picture in our heads when a system that runs all your business processes is mentioned. Nowadays, nothing is further from the truth. Modern BPM Systems are not monolithic environments; they're coordination agents. If a task is finished, they will know what to do next. If a decision needs to be made regarding the next step, they manage it. If a group of tasks can be concurrent, they turn them into parallel tasks. If a process's execution is efficient, they will perform the processing 0.1 percent of the time in the process engine and 99.9 percent of the time on tasks in external systems. This is because they will have no heavy executions within, only derivations to other systems. Also, they will be able to do this from nothing but a specific diagram for each process and specific connectors to external components. In order to empower us to do so, they need to provide us with a structure and a set of tools that we'll start defining to understand how BPM systems' internal mechanisms work, and specifically, how jBPM6 implements these tools. Components of a BPMS All big systems become manageable when we divide their complexities into smaller pieces, which makes them easier to understand and implement. BPM systems apply this by dividing each function in a different module and interconnecting them within a special structure that (in the case of jBPM6) looks something like the following figure: BPMS' internal structure Each component in the preceding figure resolves one particular function inside the BPMS architecture, and we'll see a detailed explanation on each one of them. The execution node The execution node, as seen from a black box perspective, is the component that receives the process definitions (a description of each step that must be followed; from here on, we'll just refer to them as processes). Then, it executes all the necessary steps in the established way, keeping track of each step, variable, and decision that has to be taken in each process's execution (we'll start calling these process instances). The execution node along with its modules are shown in the following figure: The execution node is composed of a set of low-level modules: the semantic module and the process engine. The semantic module The semantic module is in charge of defining each of the specific language semantics, that is, what each word means and how it will be translated to the internal structures that the process engine can execute. It consists of a series of parsers to understand different languages. It is flexible enough to allow you to extend and support multiple languages; it also allows the user to change the way already defined languages are to be interpreted for special use cases. It is a common component of most of the BPMSes out there, and in jBPM6, it allows you to add the extensions of the process interpretations to the module. 
This is so that you can add your own language parsers, and define your very own text-based process definition language or extend existing ones. The process engine The process engine is the module that is in charge of the actual execution of our business processes. It creates new process instances and keeps track of their state and their internal steps. Its job is to expose methods to inject process definitions and to create, start, and continue our process instances. Understanding how the process engine works internally is a very important task for the people involved in BPM's stage 4, that is, runtime. This is where different configurations can be used to improve performance, integrate with other systems, provide fault tolerance, clustering, and many other functionalities. Process Engine structure In the case of jBPM6, process definitions and process instances have similar structures but completely different objectives. Process definitions only show the steps it should follow and the internal structures of the process, keeping track of all the parameters it should have. Process instances, on the other hand, should carry all of the information of each process's execution, and have a strategy for handling each step of the process and keep track of all its actual internal values. Process definition structures These structures are static representations of our business processes. However, from the process engine's internal perspective, these representations are far from the actual process structure that the engine is prepared to handle. In order for the engine to get those structures generated, it requires the previously described semantic module to transform those representations into the required object structure. The following figure shows how this parsing process happens as well as the resultant structure: Using a process modeler, business analysts can draw business processes by dragging-and-dropping different activities from the modeler palette. For jBPM6, there is a web-based modeler designed to draw Scalable Vector Graphics (SVG) files; this is a type of image file that has the particularity of storing the image information using XML text, which is later transformed into valid BPMN2 files. Note that both BPMN2 and jBPM6 are not tied up together. On one hand, the BPMN2 standard can be used by other process engine provides such as Activiti or Oracle BPM Suite. Also, because of the semantic module, jBPM6 could easily work with other parsers to virtually translate any form of textual representation of a process to its internal structures. In the internal structures, we have a root component (called Process in our case, which is finally implemented in a class called RuleFlowProcess) that will contain all the steps that are represented inside the process definition. From the jBPM6 perspective, you can manually create these structures using nothing but the objects provided by the engine. 
Inside the jBPM6-Quickstart project, you will find a code snippet doing exactly this in the createProcessDefinition() method of the ProgrammedProcessExecutionTest class: //Process Definition RuleFlowProcess process = new RuleFlowProcess(); process.setId("myProgramaticProcess"); //Start Task StartNode startTask = new StartNode(); startTask.setId(1); //Script Task ActionNode scriptTask = new ActionNode(); scriptTask.setId(2); DroolsAction action = new DroolsAction(); action.setMetaData("Action", new Action() { @Override public void execute(ProcessContext context) throws Exception { System.out.println("Executing the Action!!"); } }); scriptTask.setAction(action); //End Task EndNode endTask = new EndNode(); endTask.setId(3); //Adding the connections to the nodes and the nodes to the processes new ConnectionImpl(startTask, "DROOLS_DEFAULT", scriptTask, "DROOLS_DEFAULT"); new ConnectionImpl(scriptTask, "DROOLS_DEFAULT", endTask, "DROOLS_DEFAULT"); process.addNode(startTask); process.addNode(scriptTask); process.addNode(endTask); Using this code, we can manually create the object structures to represent the process shown in the following figure: This process contains three components: a start node, a script node, and an end node. In this case, this simple process is in charge of executing a simple action. The start and end tasks simply specify a sequence. Even if this is a correct way to create a process definition, it is not the recommended one (unless you're making a low-level functionality test). Real-world, complex processes are better off being designed in a process modeler, with visual tools, and exported to standard representations such as BPMN 2.0. The output of both the cases is the same; a process object that will be understandable by the jBPM6 runtime. While we analyze how the process instance structures are created and how they are executed, this will do. Process instance structures Process instances represent the running processes and all the information being handled by them. Every time you want to start a process execution, the engine will create a process instance. Each particular instance will keep track of all the activities that are being created by its execution. In jBPM6, the structure is very similar to that of the process definitions, with one root structure (the ProcessInstance object) in charge of keeping all the information and NodeInstance objects to keep track of live nodes. The following code shows a simplification of the methods of the ProcessInstance implementation: public class RuleFlowProcessInstance implements ProcessInstance { public RuleFlowProcess getRuleFlowProcess() { ... } public long getId() { ... } public void start() { ... } public int getState() { ... } public void setVariable(String name, Object value) { ... } public Collection<NodeInstance> getNodeInstances() { ... } public Object getVariable(String name) { ... } } After its creation, the engine calls the start() method of ProcessInstance. This method seeks StartNode of the process and triggers it. Depending on the execution of the path and how different nodes connect between each other, other nodes will get triggered until they reach a safe state where the execution of the process is completed or awaiting external data. You can access the internal parameters that the process instance has through the getVariable and setVariable methods. They provide local information from the particular process instance scope. 
Summary In this article, we saw what are the basic components required to set up a BPM system. With these components in place, we are ready to explore, in more detail, the structure and working of a BPM system. Resources for Article: Further resources on this subject: jBPM for Developers: Part 1 [Article] Configuring JBoss Application Server 5 [Article] Boss jBPM Concepts and jBPM Process Definition Language (jPDL) [Article]

Program structure, execution flow, and runtime objects

Packt
30 Jul 2014
8 min read
(For more resources related to this topic, see here.)

Unfortunately, C++ isn't an easy language at all. Sometimes, you may think that it is a limitless language that cannot be learned and understood entirely, but you don't need to worry about that. It is not important to know everything; it is important to use the parts that are required in specific situations correctly. Practice is the best teacher, so it's better to understand how to use as many of the features as needed. In the examples to come, we will use Charles Simonyi's Hungarian notation. In his 1977 PhD thesis, Meta-Programming: A Software Production Method, he defined a notation standard for programming in which the first letter of a type or variable name represents its data type. For example, a class named Test should be written as CTest, where the leading C says that Test is a class. This is good practice because a programmer who is not familiar with the Test data type will immediately know that Test is a class. The same standard applies to primitive types, such as int or double. For example, iCount stands for an integer variable Count, while dValue stands for a double variable Value. Using the given prefixes, it is easy to read code even if you are not so familiar with it.

Getting ready

Make sure Visual Studio is up and running.

How to do it...

Now, let's create our first program by performing the following steps and explaining its structure:

1. Create a new C++ console application named TestDemo.
2. Open TestDemo.cpp.
3. Add the following code:

#include "stdafx.h"
#include <iostream>

using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
    cout << "Hello world" << endl;
    return 0;
}

How it works...

The structure of a C++ program varies due to different programming techniques. What most programs must have are #include (preprocessor) directives. The #include <iostream> directive tells the compiler to include the standard iostream header, where the declarations of the input/output stream facilities reside. It also means that the library with the functions' implementations will be linked into the executable. So, if we want to use some API or function, we need to include an appropriate header file, and maybe we will have to add an additional input library that contains the function/API implementation. One more important difference when including files is <header> versus "header". The first form (<>) searches the configured include paths of the solution/project, while the second ("") first searches folders relative to the C++ project. The using command instructs the compiler to use the std namespace. Namespaces are packages with object declarations and function implementations. Namespaces have a very important usage: they are used to minimize ambiguity when including third-party libraries in which the same function name is used in two different packages. We need to implement a program entry point: the main function. As we said before, we can use main for an ANSI signature, wmain for a Unicode signature, or _tmain, where the compiler will resolve its signature depending on the preprocessor definitions in the project property pages. For a console application, the main function can have the following four different prototypes:

- int _tmain(int argc, TCHAR* argv[])
- void _tmain(int argc, TCHAR* argv[])
- int _tmain(void)
- void _tmain(void)

The first prototype has two arguments, argc and argv. The first argument, argc, or the argument count, says how many arguments are present in the second argument, argv, or the argument values.
The argv parameter is an array of strings, where each string represents one command-line argument. The first string in argv is always the current program name. The second prototype is the same as the first one, except for the return type. This means that the main function may or may not return a value; when it does, the value is returned to the OS. The third prototype has no arguments and returns an integer value, while the fourth prototype neither has arguments nor returns a value. It is good practice to use the first form. The next statement uses the cout object. The cout object is the standard output stream in C++, and the meaning of the entire statement is to insert a sequence of characters (in this case, the Hello world sequence of characters) into the standard output stream, which usually corresponds to the screen. The cout object is declared in the iostream standard header within the std namespace. This is why we needed to include that header and declare that we will use this specific namespace earlier in our code. Since we selected the first prototype, int _tmain(int, _TCHAR**), _tmain returns an integer, so we must specify some int value after the return statement, in this case 0. When returning a value to the operating system, 0 usually means success, but this is operating system dependent. This simple program is very easy to create. We use it to demonstrate the basic program structure and the use of the main routine as the entry point of every C++ program. Programs with one thread are executed sequentially, line by line. This is why our program is not user friendly if we place all of the code in one thread: while the application waits for user input, nothing else can run, and only after the user provides the input can the application continue its execution. To overcome this issue, we can create concurrent threads that handle the user input. In this way, the application does not stall and stays responsive all the time. After a thread handles its task, it can signal to the application that the user has performed the requested operation.

There's more...
Every time we have an operation that needs to be executed separately from the main execution flow, we have to think about a separate thread. The simplest example is when we have some calculation and we want a progress bar that shows the calculation's progress. If the same thread were responsible for the calculation as well as for updating the progress bar, it probably wouldn't work: when both the work and the UI updates are performed on a single thread, that thread cannot service the OS painting adequately, so almost always the UI thread is kept separate from the worker threads. Let's review the following example. Assume that we have created a function that calculates something, for example, sines or cosines of some angle, and we want to display progress at every step of the calculation:

void CalculateSomething(int iCount)
{
   int iCounter = 0;
   while (iCounter++ < iCount)
   {
      //make calculation
      //update progress bar
   }
}

As the statements are executed one after another inside each iteration of the while loop, the operating system doesn't get the time required to properly update the user interface (in this case, the progress bar), so we would see an empty progress bar, and only after the function returns would a fully filled progress bar appear. The solution is to keep the progress bar on the main thread and move the calculation to a separate thread, as sketched below.
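The following is a minimal sketch of this idea. It assumes standard C++11 facilities (std::thread and std::atomic, which are not part of the original example), uses plain main instead of _tmain, and prints a console progress readout in place of a real progress bar control. The worker thread performs the calculation and publishes its progress, while the main thread stays free to update the display:

#include <atomic>
#include <chrono>
#include <cmath>
#include <iostream>
#include <thread>

std::atomic<int> g_progress(0);   // shared progress counter, written by the worker

// Worker thread: performs the calculation and reports progress.
void CalculateSomething(int iCount)
{
    for (int i = 0; i < iCount; ++i)
    {
        volatile double dValue = std::sin(i * 0.001);  // the "work"
        (void)dValue;
        g_progress = i + 1;                            // publish progress
    }
}

int main()
{
    const int iCount = 100000000;
    std::thread worker(CalculateSomething, iCount);    // start the calculation

    // Main thread: stays responsive and displays progress.
    while (g_progress < iCount)
    {
        std::cout << "\rProgress: " << (100LL * g_progress / iCount) << "%" << std::flush;
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
    worker.join();
    std::cout << "\rProgress: 100%" << std::endl;
    return 0;
}

In a GUI application, the main thread would update an actual progress bar control instead of writing to cout, but the division of work between the two threads is the same.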
In other words, a separate thread should execute the CalculateSomething function and, at each iteration step, signal the main thread to update the progress bar. As we said before, threads are switched extremely fast on the CPU, so we get the impression that the progress bar is updated at the same time as the calculation is performed. To conclude, each time we have to perform a parallel task, wait for some kind of user input, or wait for an external dependency such as a response from a remote server, we will create a separate thread so that our program won't hang and become unresponsive. In our future examples, we will discuss static and dynamic libraries, so let's say a few words about both. A static library (*.lib) is usually some code placed in separate files and already compiled for future use. We add it to the project when we want to use some of its features. When we wrote #include <iostream> earlier, we instructed the compiler to include the header for a static library where the implementations of the input/output stream functions reside. A static library is built into the executable at compile (link) time, before we actually run the program. A dynamic library (*.dll) is similar to a static library, but the difference is that it is not resolved during compilation; it is linked later, when we start the program, in other words, at runtime. Dynamic libraries are very useful when we have functions that a lot of programs will use: we don't need to build these functions into every program; we simply link every program at runtime against one dynamic library. A good example is User32.dll, where the Windows OS has placed the majority of its GUI functions. So, if we create two programs that both have a window (GUI form), we do not need to include CreateWindow in both programs. We simply link against User32.dll at runtime, and the CreateWindow API is available to both.

Summary
Thus, the article covered the four main paradigms: imperative, declarative, functional (or structural), and object-oriented.

Resources for Article:
Further resources on this subject: OpenGL 4.0: Building a C++ Shader Program Class [Article] Application Development in Visual C++ - The Tetris Application [Article] Building UI with XAML for Windows 8 Using C [Article]

The .NET Framework Primer

Packt
22 Jul 2014
17 min read
(For more resources related to this topic, see here.)

An evaluation framework for .NET Framework APIs
Understanding the .NET Framework in its entirety, including keeping track of the APIs that are available in various versions (for example, 3.5, 4, 4.5, 4.5.1, and so on, and platforms such as Windows 8, Windows Phone 8, and Silverlight 5), is a near impossible undertaking. What software developers and architects need is a high-level framework to logically partition the .NET Framework and identify the APIs that should be used to address a given requirement or category of requirements. API boundaries in the .NET Framework can be a little fuzzy. Some logical APIs span multiple assemblies and namespaces. Some are nicely contained within a neat hierarchy within a single root namespace. To confuse matters even further, single assemblies might contain portions of multiple APIs. The most practical way to distinguish an API is to use the API's root namespace or the namespace that contains the majority of the API's implementation. We will point out the cases where an API spans multiple namespaces or there are peculiarities in the namespaces of an API.

Evaluation framework dimensions
The dimensions of the .NET Framework API evaluation framework are as follows:
Maturity: This dimension indicates how long the API has been available, how long it has been part of the .NET Framework, and what the API's expected longevity is. It is also a measure of how relevant the API is, or an indication that the API has been subsumed by newer and better APIs.
Productivity: This dimension is an indication of how the use of the API will impact developer productivity. This dimension is measured by how easy the API is to learn and use, how well known the API is within the developer community, how simple or complex it is to use, the richness of the tool support (primarily in Visual Studio), and how abstract the API is, that is, whether it is declarative or imperative.
Performance: This dimension indicates whether the API was designed with performance, resource utilization, user interface responsiveness, or scalability in mind; alternatively, it indicates whether convenience, ease of use, or code pithiness were the primary design criteria, which often comes at the expense of the former.
Availability: This dimension indicates whether the API is available only on limited versions of the .NET Framework and Microsoft operating systems, or whether it is available everywhere that managed code is executed, including third-party implementations on non-Microsoft operating systems, for example, Mono on Linux.

Evaluation framework ratings
Each dimension of the API evaluation framework is given a four-level rating. Let's take a look at the ratings for each of the dimensions.

The ratings for Maturity are as follows:
Emerging: This refers to a new API that was either added to the .NET Framework in the last release or is a candidate for addition in an upcoming release, and that has not gained widespread adoption yet. This also includes APIs that are not officially part of the .NET Framework.
New and promising: This is an API that has been in the .NET Framework for a couple of releases; it is already being used by the community in production systems, but it has yet to hit the mainstream. This rating may also include Microsoft APIs that are not officially part of .NET, but that show a great deal of promise or are being used extensively in production.
Tried and tested: This is an API that has been in the .NET Framework for multiple releases, has attained very broad adoption, has been refined and improved with each release, and is probably not going to be subsumed by a new API or deprecated in a later version of the Framework.
Showing its age: The API is no longer relevant, has been subsumed by a superior API, has been entirely deprecated in recent versions of .NET, or has been merged into a new API.

The ratings for Productivity are as follows:
Decrease: This is a complex API that is difficult to learn and use and not widely understood within the .NET developer community. Typically, these APIs are imperative, that is, they expose the underlying plumbing that needs to be understood to correctly use the API, and there is little or no tooling provided in Visual Studio. Using this API results in lowered developer productivity.
No or little impact: This API is fairly well known and used by the .NET developer community, but its use will have little effect on productivity, either because of its complexity, steep learning curve, and lack of tool support, or because there is simply no alternative API.
Increase: This API is well known and used by the .NET developer community, is easy to learn and use, has good tool support, and is typically declarative; that is, the API allows the developer to express the behavior they want without requiring an understanding of the underlying plumbing, and in minimal lines of code too.
Significant increase: This API is very well known and used in the .NET developer community, is very easy to learn, has excellent tool support, and is declarative and pithy. Its use will significantly improve developer productivity.

The ratings for Performance and Scalability are as follows:
Decrease: The API was designed for developer productivity or convenience and will more than likely result in slower execution of code and increased usage of system resources (when compared to the use of other .NET APIs that provide the same or similar capabilities). Do not use this API if performance is a concern.
No or little impact: The API strikes a good balance between performance and developer productivity. Using it should not significantly impact the performance or scalability of your application. If performance is a concern, you can use the API, but do so with caution and make sure you measure its impact.
Increase: The API has been optimized for performance or scalability, and it generally results in faster, more scalable code that uses fewer system resources. It is safe to use in performance-sensitive code paths if best practices are followed.
Significant increase: The API was designed and written from the ground up with performance and scalability in mind. The use of this API will result in a significant improvement in performance and scalability over other APIs.

The ratings for Availability are as follows:
Rare: The API is available in limited versions of the .NET Framework and on limited operating systems. Avoid this API if you are writing code that needs to be portable across all platforms.
Limited: This API is available on most versions of the .NET Framework and Microsoft operating systems. It is generally safe to use, unless you are targeting very old versions of .NET or Microsoft operating systems.
Microsoft Only: This API is available on all versions of the .NET Framework and all Microsoft operating systems. It is safe to use if you are on the Microsoft platform and are not targeting third-party CLI implementations, such as Mono.
Universal: The API is available on all versions of .NET, including those from third parties, and it is available on all operating systems, including non-Microsoft operating systems. It is always safe to use this API.

The .NET Framework
The rest of this article will highlight some of the more commonly used APIs within the .NET Framework and rate each of these APIs using the evaluation framework described previously.

The Base Class Library
The Base Class Library (BCL) is the heart of the .NET Framework. It contains base types, collections, and APIs to work with events and attributes; console, file, and network I/O; and text, XML, threads, application domains, security, debugging, tracing, serialization, interoperation with native COM and Win32 APIs, and the other core capabilities that most .NET applications need. The BCL is contained within the mscorlib.dll, System.dll, and System.Core.dll assemblies. The mscorlib.dll assembly is loaded during the CLR bootstrap (not by the CLR Loader), contains all nonoptional APIs and types, and is universally available in every .NET process, such as Silverlight, Windows Phone, and ASP.NET. Optional BCL APIs and types are available in System.dll and System.Core.dll, which are loaded on demand by the CLR Loader, as with all other managed assemblies. It would be a rare exception, however, for a .NET application not to use either of these two assemblies, since they contain some very useful APIs. When creating any project type in Visual Studio, these assemblies will be referenced by default. For the purpose of this framework, we will treat all of the BCL as a logical unit and not differentiate the nonoptional APIs (that is, the ones contained within mscorlib.dll) from the optional ones. Being a significant subset of the .NET Framework, the BCL contains a large number of namespaces and APIs. The next sections describe a partial list of some of the more notable namespaces/APIs within the BCL, with an evaluation for each:
System namespace
System.Text namespace
System.IO namespace
System.Net namespace
System.Collections namespace
System.Collections.Generic namespace
System.Collections.Concurrent namespace
System.Linq namespace
System.Xml namespace
System.Xml.Linq namespace
System.Security.Cryptography namespace
System.Threading namespace
System.Threading.Tasks namespace
System.ServiceProcess namespace
System.ComponentModel.Composition namespace
System.ComponentModel.DataAnnotations namespace

ADO.NET
Most computer programs are meaningless without appropriate data to operate over. Accessing this data in an efficient way has become one of the greatest challenges modern developers face, as datasets have grown in size from megabytes, to gigabytes, to terabytes, and now petabytes in the most extreme cases; for example, Google's search database is around a petabyte. Though relational databases no longer hold the scalability high ground, a significant percentage of the world's data still resides in them and will probably continue to do so for the foreseeable future. ADO.NET contains a number of APIs to work with relational data and data provider APIs to access Microsoft SQL Server, Oracle Database, OLEDB, ODBC, and SQL Server Compact Edition.
System.Data namespace System.Data.Entity namespace System.Data.Linq namespace System.Data.Services namespace Windows Forms Windows Forms (WinForms) was the original API for developing the user interface (UI) of Windows desktop applications with the .NET Framework. It was released in the first version of .NET and every version since then. System.Windows.Forms namespace The WinForms API is contained within the System.Windows.Forms namespace. Though WinForms is a managed API, it is actually a fairly thin façade over earlier, unmanaged APIs, primarily Win32 and User32, and any advanced use of WinForms requires a good understanding of these underlying APIs. The advanced customizations of WinForms controls often require the use of the System.Drawing API, which is also just a managed shim over the unmanaged GDI+ API. Many new applications are still developed using WinForms, despite its age and the alternative .NET user interface APIs that are available. It is a very well understood API, is very stable, and has been optimized for performance (though it is not GPU-accelerated like WPF or WinRT). There are a significant number of vendors who produce feature-rich, high-quality, third-party WinForms controls, and WinForms is available in every version of .NET and on most platforms, including Mono. WinForms is clearly showing its age, particularly when its capabilities are compared to those of WPF and WinRT, but it is still a viable API for applications that exclusively target the desktop and where a sophisticated modern UI is not necessary. The following table shows the evaluation of the System.Windows.Forms namespace: Maturity Productivity Performance Availability Windows Presentation Foundation Windows Presentation Foundation (WPF) is an API, introduced in .NET 3.0, for developing rich user interfaces for .NET applications, with no dependencies on legacy Windows APIs and with support for GPU-accelerated 3D rendering, animation, and media playback. If you want to play a video on a clickable button control on the surface of an animated, 3D rotating cube and the only C# code you want to write is the button click event handler, then WPF is the API for the job. See the WPFSample code for a demonstration. System.Windows namespace The System.Windows namespace contains the Windows Presentation Foundation API. WPF includes many of the "standard" controls that are in WinForms, for example, Button, Label, CheckBox, ComboBox, and so on. However, it also includes APIs to create, animate, and render 3D graphics; render multimedia; draw bitmap and vector graphics; and perform animation. WPF addresses many of the limitations of Windows Forms, but this power comes at a price. WPF introduces a number of novel concepts that developers will need to master, including a new, declarative UI markup called Extensible Application Markup Language (XAML), new event handling, data binding and control theming mechanisms, and a variant of the Model-view-controller (MVC) pattern called Model View ViewModel (MVVM); that said, the use of this pattern is optional but highly recommended. WPF has significantly more moving parts than WinForms, if you ignore the underlying native Windows APIs that WinForm abstracts. Microsoft, though, has gone to some lengths to make the WPF development experience easier for both UI designers and developers. Developers using WPF can choose to design and develop user interfaces using XAML, any of the .NET languages, or most often a combination of the two. 
Visual Studio and Expression Blend provide rich WYSIWYG designers to create WPF controls and interfaces and hide the complexities of the underlying XAML. Direct tweaking of the XAML is sometimes required for precise adjustments. WPF is now a mature, stable API that has been highly optimized for performance—all of its APIs are GPU accelerated. Though it is probably not as well known as WinForms, it has become relatively well known within the developer community, particularly because Silverlight, which is Microsoft's platform for developing rich web and mobile applications, uses a subset of WPF. Many of the third-party control vendors who produce WinForm controls now also produce equivalent WPF controls. The tools for creating WPF applications, predominantly Visual Studio and Expression Blend, are particularly good, and there are also a number of good third-party and open source tools to work with XAML. The introduction of WinRT and the increasingly powerful capabilities of web browser technologies, including HTML5, CSS3, JavaScript, WebGL, and GPU-acceleration, raise valid questions about the long-term future of WPF and Silverlight. Microsoft seems to be continuing to promote the use of WPF, and even WinRT supports a variant of the XAML markup language, so it should remain a viable API for a while. The following table shows the evaluation of the System.Windows namespace: Maturity Productivity Performance Availability ASP.NET The .NET Framework was originally designed to be Microsoft's first web development platform, and it included APIs to build both web applications and web services. These APIs were, and still are, part of the ASP.NET web development framework that lives in the System.Web namespace. ASP.NET has come a very long way since the first release of .NET, and it is the second most widely used and popular web framework in the world today (see http://trends.builtwith.com/framework). The ASP.NET platform provides a number of complimentary APIs that can be used to develop web applications, including Web Forms, web services, MVC, web pages, Web API, and SignalR. System.Web.Forms namespace System.Web.Mvc namespace System.Web.WebPages namespace System.Web.Services namespace Microsoft.AspNet.SignalR namespace Windows Communication Foundation One of the major selling points of the first release of .NET was that the platform had support for web services baked in, in the form of ASP.NET Web Services. Web Services have come a very long way since SOAP was invented in 1998 and the first release of .NET, and WCF has subsumed the limited capabilities of ASP.NET Web Services with a far richer platform. WCF has also subsumed the original .NET Remoting (System.Runtime.Remoting), MSMQ (System.Messaging), and Enterprise Services (System.EnterpriseServices) APIs. System.ServiceModel namespace The root namespace for WCF is System.ServiceModel. This API includes support for most of the WS-* web services standards and non-HTTP or XML-based services, including MSMQ and TCP services that use binary or Message Transmission Optimization Mechanism (MTOM) message encoding. Address, Binding, and Contract (ABC) of WCF are very well understood by the majority of the developer community, though deep technical knowledge of WCF's inner workings is rarer. The use of attributes to declare service and data contracts and a configuration-over-code approach makes the WCF API highly declarative, and creating sophisticated services that use advanced WS-* capabilities is relatively easy. 
WCF is very stable and can be used to create high-performance distributed applications. WCF is available on all recent versions of .NET, though not all platforms include the server components of WCF. Partial support for WCF is also available on third-party CLI implementations, such as Mono. REST-based web services, that serve relatively simple XML or JSON, have become very popular, and though WCF fairly recently added support for REST, these capabilities have now evolved into the ASP.NET Web API. The following table shows the evaluation of the System.ServiceModel namespace: Maturity Productivity Performance Availability Windows Workflow Foundation Windows Workflow Foundation (WF) is a workflow framework that was introduced in .NET 3.0, and that brings the power and flexibility of declarative workflow or business process design and execution to .NET applications. System.Activities namespace The System.Activities namespace contains the Windows Workflow Foundation API. WF includes a workflow runtime, a hosting API, a number of basic workflow activities, APIs to create custom activities, and a workflow designer control, which was originally a WinForms control but is now a WPF control as of .NET 4.0. WF also uses a variant of the same XAML markup, which WPF and WinRT use, to represent workflows; that said, an excellent designer, hosted by default in Visual Studio, should mean that you never have to directly modify the XAML. The adoption of the first few versions of the WF API was limited, but WF was completely rewritten for .NET 4.0, and many of the shortcomings of the original version were entirely addressed. WF is now a mature, stable, best-of-breed workflow API, with a proven track record. The previous implementation of WF is still available in current versions of the .NET Framework, for migration and interoperation purposes, and is in the System.Workflow namespace. WF is used by SharePoint Server, Windows Server AppFabric, Windows Azure AppFabric, Office 365, Visual Studio Team Foundation Server (MSBuild), and a number of other Microsoft products and services. Windows Server AppFabric and Windows Azure AppFabric enable a new class of scalable SOA server and cloud application called a Workflow Service, which is a combination of the capabilities of WCF and WF. WF has a relatively small but strong following within the .NET developer community. There are also a number of third-party and open source WF activity libraries and tools available. Though applications composed using workflows typically have poorer performance than those that are implemented entirely in code, the flexibility and significantly increased developer productivity (particularly when it comes to modifying existing processes) that workflows give you are often worth the performance price. That said, Microsoft has made significant investments in optimizing the performance of WF, and it should be more than adequate for most enterprise application scenarios. Though versions of WF are available on other CLI platforms, the availability of WF 4.x is limited to Microsoft platforms and .NET 4.0 and higher. 
The evaluation of the System.Workflow namespace shown in the following table is for the most recent version of WF (the use of versions of WF prior to 4.0 is not recommended for new applications): Maturity Productivity Performance Availability Summary There is more to the .NET Framework than has been articulated in this primer; it includes many useful APIs that have not even been mentioned here, for example, System.Media, System.Speech, and the Windows Identity Framework. There are also a number of very powerful APIs developed by Microsoft (and Microsoft Research) that are not (yet) officially part of the .NET Framework; for example, Reactive Extensions, Microsoft Solver Foundation, Windows Azure APIs, and the new .NET for Windows Store Apps APIs are worth looking into. Resources for Article:   Further resources on this subject: Content Based Routing on Microsoft Platform [article] Building the Content Based Routing Solution on Microsoft Platform [article] Debatching Bulk Data on Microsoft Platform [article]

Configuration

Packt
21 Jul 2014
12 min read
(For more resources related to this topic, see here.)

Configuration targets
In this section, we look at the different layers that can be configured. The layers are:
SYSTEM: This layer is system-wide and found in /etc/gitconfig
GLOBAL: This layer is global for the user and found in ~/.gitconfig
LOCAL: This layer is local to the current repository and found in .git/config

Getting ready
We will use the jgit repository for this example, as shown in the following command:

$ git clone https://git.eclipse.org/r/jgit/jgit
$ cd jgit

How to do it...
In the previous example, we saw how we could use the command git config --list to list configuration entries. This list is actually made from the three different levels of configuration that Git offers: system-wide configuration, SYSTEM; global configuration for the user, GLOBAL; and local repository configuration, LOCAL. For each of these configuration layers, we can query the existing configuration. On a Windows box with a default installation of the Git extensions, the different configuration layers will look approximately like the following:

$ git config --list --system
core.symlinks=false
core.autocrlf=true
color.diff=auto
color.status=auto
color.branch=auto
color.interactive=true
pack.packsizelimit=2g
help.format=html
http.sslcainfo=/bin/curl-ca-bundle.crt
sendemail.smtpserver=/bin/msmtp.exe
diff.astextplain.textconv=astextplain
rebase.autosquash=true

$ git config --list --global
merge.tool=kdiff3
mergetool.kdiff3.path=C:/Program Files (x86)/KDiff3/kdiff3.exe
diff.guitool=kdiff3
difftool.kdiff3.path=C:/Program Files (x86)/KDiff3/kdiff3.exe
core.editor="C:/Program Files (x86)/GitExtensions/GitExtensions.exe" fileeditor
core.autocrlf=true
credential.helper=!"C:/Program Files (x86)/GitExtensions/GitCredentialWinStore/git-credential-winstore.exe"
user.name=Aske Olsson
[email protected]

$ git config --list --local
core.repositoryformatversion=0
core.filemode=false
core.bare=false
core.logallrefupdates=true
core.symlinks=false
core.ignorecase=true
core.hidedotfiles=dotGitOnly
remote.origin.url=https://git.eclipse.org/r/jgit/jgit
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
branch.master.remote=origin
branch.master.merge=refs/heads/master

We can also query a single key and limit the scope to one of the three layers, by using the following command:

$ git config --global user.email
[email protected]

We can set the e-mail address of the user to a different one for the current repository:

$ git config --local user.email [email protected]

Now, listing the GLOBAL layer user.email will return [email protected], listing LOCAL gives [email protected], and listing user.email without specifying the layer gives the effective value that is used in the operations on this repository, in this case, the LOCAL value [email protected]. The effective value is the value that takes precedence when needed. When two or more values are specified for the same key, but on different layers, the lowest layer takes precedence. When a configuration value is needed, Git will first look in the LOCAL configuration. If it is not found there, the GLOBAL configuration is queried. If it is not found in the GLOBAL configuration, the SYSTEM configuration is used. If none of these has the key, the default value in Git is used. In the previous example, user.email is specified in both the GLOBAL and LOCAL layers. Hence, the LOCAL layer will be used.

How it works...
Querying the three layers of configuration simply returns the content of the corresponding configuration files: /etc/gitconfig for system-wide configuration, ~/.gitconfig for user-specific configuration, and .git/config for repository-specific configuration. When the configuration layer is not specified, the returned value is the effective value.

There's more...
Instead of setting all the configuration values on the command line by key and value, it is possible to set them by editing the configuration file directly. Open the configuration file in your favorite editor and set the configuration you need, or use the built-in git config -e command to edit the repository configuration directly in the Git-configured editor. You can set the editor to the editor of your choice either by changing the $EDITOR environment variable or with the core.editor configuration target, for example:

$ git config --global core.editor vim

Querying the existing configuration
In this example, we will look at how we can query the existing configuration and set the configuration values.

Getting ready
We'll use jgit again by using the following command:

$ cd jgit

How to do it...
To view all the effective configurations for the current Git repository, run the following command:

$ git config --list
user.name=Aske Olsson
[email protected]
core.repositoryformatversion=0
core.filemode=false
core.bare=false
core.logallrefupdates=true
remote.origin.url=https://git.eclipse.org/r/jgit/jgit
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
branch.master.remote=origin
branch.master.merge=refs/heads/master

The previous output will of course reflect the user running the command. Instead of Aske Olsson as the name and e-mail, the output should reflect your settings. If we are just interested in a single configuration item, we can query it by its section.key or section.subsection.key:

$ git config user.name
Aske Olsson
$ git config remote.origin.url
https://git.eclipse.org/r/jgit/jgit

How it works...
Git's configuration is stored in plaintext files and works like a key-value store. You can set/query by key and get the value back. An example of the text-based configuration file is shown as follows (from the jgit repository):

$ cat .git/config
[core]
    repositoryformatversion = 0
    filemode = false
    bare = false
    logallrefupdates = true
[remote "origin"]
    url = https://git.eclipse.org/r/jgit/jgit
    fetch = +refs/heads/*:refs/remotes/origin/*
[branch "master"]
    remote = origin
    merge = refs/heads/master

There's more...
It is also easy to set configuration values. Just use the same syntax as when querying the configuration, except that you add the value as an argument. To set a new e-mail address on the LOCAL layer, we can execute the following command line:

$ git config user.email [email protected]

The LOCAL layer is the default if nothing else is specified. If you require whitespace in the value, you can enclose the string in quotation marks, as you would do when configuring your name:

$ git config user.name "Aske Olsson"

You can even set your own configuration, which does not have any effect on core Git, but can be useful for scripting, builds, and so on:

$ git config my.own.config "Whatever I need"

List the value:

$ git config my.own.config
Whatever I need

It is also very easy to delete/unset configuration entries:

$ git config --unset my.own.config

List the value:

$ git config my.own.config

Templates
In this example, we will see how to create a template commit message that will be displayed in the editor when creating a commit.
The template is only for the local user and not distributed with the repository in general.

Getting ready
In this example, we will use the example repository:

$ git clone https://github.com/dvaske/data-model.git
$ cd data-model

We'll use the following code as a commit message template for commit messages:

Short description of commit

Longer explanation of the motivation for the change

Fixes-Bug: Enter bug-id or delete line
Implements-Requirement: Enter requirement-id or delete line

Save the commit message template in $HOME/.gitcommitmsg.txt. The filename isn't fixed and you can choose a filename of your liking.

How to do it...
To let Git know about our new commit message template, we can set the configuration variable commit.template to point at the file we just created with that template; we'll do it globally so it is applicable to all our repositories:

$ git config --global commit.template $HOME/.gitcommitmsg.txt

Now, we can try to change a file, add it, and create a commit. This will bring up our preferred editor with the commit message template preloaded:

$ git commit

Short description of commit

Longer explanation of the motivation for the change

Fixes-Bug: Enter bug-id or delete line
Implements-Requirement: Enter requirement-id or delete line

# Please enter the commit message for your changes. Lines starting
# with '#' will be ignored, and an empty message aborts the commit.
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#   modified:   another-file.txt
#
~
~
".git/COMMIT_EDITMSG" 13 lines, 396 characters

We can now edit the message according to our commit and save to complete the commit.

How it works...
When commit.template is set, Git simply uses the content of the template file as a starting point for all commit messages. This is quite convenient if you have a commit-message policy as it greatly increases the chances of the policy being followed. You can even have different templates tied to different repositories, since you can just set the configuration at the local level.

A .git directory template
Sometimes, having a global configuration isn't enough. You will also need to trigger the execution of scripts (aka Git hooks), exclude files, and so on. It is possible to achieve this with the template option set to git init. It can be given as a command-line option to git clone and git init, or as the $GIT_TEMPLATE_DIR environment variable, or as the configuration option init.templatedir. It defaults to /usr/share/git-core/templates. The template option works by copying the files in the template directory to the .git ($GIT_DIR) folder after it has been created. The default directory contains sample hooks and some suggested exclude patterns. In the following example, we'll see how we can set up a new template directory, and add a commit message hook and exclude file.

Getting ready
First, we will create the template directory. We can use any name we want, and we'll use ~/.git_template, as shown in the following command:

$ mkdir ~/.git_template

Now, we need to populate the directory with some template files. This could be a hook or an exclude file. We will create one hook file and an exclude file. The hook file is located in .git/hooks/name-of-hook and the exclude file in .git/info/exclude.
Create the two directories needed, hooks and info, as shown in the following command:

$ mkdir ~/.git_template/{hooks,info}

To keep the sample hooks provided by the default template directory (the Git installation), we copy the files in the default template directory to the new one. When we use our newly created template directory, we'll override the default one. So, copying the default files to our template directory will make sure that, except for our specific changes, the template directory is similar to the default one, as shown in the following command:

$ cd ~/.git_template/hooks
$ cp /usr/share/git-core/templates/hooks/* .

We'll use the commit-msg hook as the example hook:

#!/bin/sh
MSG_FILE="$1"
echo "\nHi from the template commit-msg hook" >> $MSG_FILE

The hook is very simple and will just add Hi from the template commit-msg hook to the end of the commit message. Save it as commit-msg in the ~/.git_template/hooks directory and make it executable by using the following command:

$ chmod +x ~/.git_template/hooks/commit-msg

Now that the commit message hook is done, let's also add an exclude file to the example. The exclude file works like the .gitignore file, but is not tracked in the repository. We'll create an exclude file that excludes all the *.txt files, as follows:

$ echo *.txt > ~/.git_template/info/exclude

Now, our template directory is ready for use.

How to do it...
Our template directory is ready and we can use it, as described earlier, as a command-line option, an environment variable, or, as in this example, as a configuration option:

$ git config --global init.templatedir ~/.git_template

Now, all Git repositories we create using init or clone will have the default files of the template directory. We can test whether it works by creating a new repository as follows:

$ git init template-example
$ cd template-example

Let's try to create a .txt file and see what git status tells us. It should be ignored by the exclude file from the template directory:

$ echo "this is the readme file" > README.txt
$ git status

The exclude file worked! You can put in the file endings yourself or just leave the file blank and keep to the .gitignore files. To test whether the commit-msg hook also works, let us try to create a commit. First, we need a file to commit. So, let's create that and commit it as follows:

$ echo "something to commit" > somefile
$ git add somefile
$ git commit -m "Committed something"

We can now check the history with git log:

$ git log -1
commit 1f7d63d7e08e96dda3da63eadc17f35132d24064
Author: Aske Olsson <[email protected]>
Date:   Mon Jan 6 20:14:21 2014 +0100

    Committed something

    Hi from the template commit-msg hook

How it works...
When Git creates a new repository, either via init or clone, it will copy the files from the template directory to the new repository when creating the directory structure. The template directory can be defined either by a command-line argument, an environment variable, or a configuration option. If nothing is specified, the default template directory will be used (distributed with the Git installation). By setting the configuration as a --global option, the template directory defined will apply to all of the user's (new) repositories. This is a very nice way to distribute the same hooks across repositories, but it also has some drawbacks. As the files in the template directory are only copied to the Git repositories, updates to the template directory do not affect the existing repositories.
This can be solved by running git init in each existing repository to reinitialize the repository, but this can be quite cumbersome. Also, the template directory can enforce hooks on some repositories where you don't want them. This is quite easily solved by simply deleting the hook files in .git/hooks of that repository.

Design patterns

Packt
21 Jul 2014
5 min read
(For more resources related to this topic, see here.) Design patterns are ways to solve a problem and to get the intended result in the best possible manner. So, design patterns are not only ways to create large and robust systems, but they also provide great architectures in a friendly manner. In software engineering, a design pattern is a general, repeatable, and optimized solution to a commonly occurring problem within a given context in software design. It is a description or template for how to solve a problem, and the solution can be used in different instances. The following are some of the benefits of using design patterns:
Maintenance
Documentation
Readability
Ease in finding appropriate objects
Ease in determining object granularity
Ease in specifying object interfaces
Ease in implementing even for large software projects
Implements the code reusability concept

If you are not familiar with design patterns, the best way to begin understanding them is to observe the solutions we use for commonly occurring, everyday life problems. Many different types of power plugs exist in the world, so we need a solution that is reusable, optimized, and cheaper than buying a new device for each power plug type. In simple words, we need an adapter. In this case, an adapter is the best solution that's reusable, optimized, and cheap. But an adapter does not provide us with a solution when our car's wheel blows out. In object-oriented languages, we programmers use objects to achieve the outcomes we desire. Hence, we have many types of objects, situations, and problems, which means we need more than just one approach to solving different kinds of problems.

Elements of design patterns
The following are the elements of design patterns:
Name: This is a handle we can use to describe the problem
Problem: This describes when to apply the pattern
Solution: This describes the elements, relationships, responsibilities, and collaborations in a way that we follow to solve a problem
Consequences: This details the results and trade-offs of applying the pattern

Classification of design patterns
Design patterns are generally divided into three fundamental groups:
Creational patterns
Structural patterns
Behavioral patterns

Let's examine these in the following subsections.

Creational patterns
Creational patterns are a subset of design patterns in the field of software development; they serve to create objects. They decouple the design of an object from its representation. Object creation is encapsulated and outsourced (for example, to a factory) to keep the context of object creation independent of the concrete implementation. This is in accordance with the rule: "Program to an interface, not an implementation."
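To make that rule concrete, the following is a minimal, hypothetical factory sketch. It is written in C++ for brevity rather than in PHP/Laravel, and the Logger, ConsoleLogger, NullLogger, and LoggerFactory names are illustrative only, not taken from any framework; the point is simply that the caller asks a factory for an object and works against an interface, never naming the concrete class:

#include <iostream>
#include <memory>
#include <string>

// The interface clients program against.
class Logger {
public:
    virtual ~Logger() = default;
    virtual void Log(const std::string& message) = 0;
};

class ConsoleLogger : public Logger {
public:
    void Log(const std::string& message) override {
        std::cout << "[console] " << message << std::endl;
    }
};

class NullLogger : public Logger {
public:
    void Log(const std::string&) override {}   // discards all messages
};

// The factory encapsulates which concrete type gets created.
class LoggerFactory {
public:
    static std::unique_ptr<Logger> Create(const std::string& kind) {
        if (kind == "console")
            return std::make_unique<ConsoleLogger>();
        return std::make_unique<NullLogger>();
    }
};

int main() {
    // The caller depends only on the Logger interface,
    // not on any concrete logger class.
    std::unique_ptr<Logger> logger = LoggerFactory::Create("console");
    logger->Log("Factory-created logger in use");
    return 0;
}

Swapping ConsoleLogger for another implementation requires no change in the calling code, which is exactly the decoupling that creational patterns aim for.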
Some of the features of creational patterns are as follows:
Generic instantiation: This allows objects to be created in a system without having to identify a specific class type in code (the Abstract Factory and Factory patterns)
Simplicity: Some of the patterns make object creation easier, so callers will not have to write large, complex code to instantiate an object (the Builder (Manager) and Prototype patterns)
Creation constraints: Creational patterns can put bounds on who can create objects, how they are created, and when they are created

The following patterns are called creational patterns:
The Abstract Factory pattern
The Factory pattern
The Builder (Manager) pattern
The Prototype pattern
The Singleton pattern

Structural patterns
In software engineering, structural design patterns facilitate easy ways of communication between various entities. Some examples of structural patterns are as follows:
Composite: This composes objects into tree structures (whole-part hierarchies). Composite allows clients to treat individual objects and compositions of objects uniformly.
Decorator: This dynamically adds responsibilities to an object. A Decorator is a flexible alternative to subclassing for extending functionality.
Flyweight: This shares fine-grained objects (objects that hold no state of their own) to avoid creating an excessive number of them.
Adapter: This converts the interface of a class into another interface that the clients expect. Adapter lets classes work together that normally would not be able to because of their different interfaces.
Facade: This provides a unified interface to the various interfaces of a subsystem. Facade defines a higher-level interface to the subsystem, which is easier to use.
Proxy: This provides a replacement (surrogate) for another object and controls access to the original object.
Bridge: This separates an abstraction from its implementation so that the two can be altered independently.

Behavioral patterns
Behavioral patterns are all about communication between a class' objects; they are the patterns that are most specifically concerned with how objects communicate. The following is a list of the behavioral patterns:
Chain of Responsibility pattern
Command pattern
Interpreter pattern
Iterator pattern
Mediator pattern
Memento pattern
Observer pattern
State pattern
Strategy pattern
Template pattern
Visitor pattern

If you want to check out the usage of some patterns in the Laravel core, have a look at the following list:
The Builder (Manager) pattern: Illuminate\Auth\AuthManager and Illuminate\Session\SessionManager
The Factory pattern: Illuminate\Database\DatabaseManager and Illuminate\Validation\Factory
The Repository pattern: Illuminate\Config\Repository and Illuminate\Cache\Repository
The Strategy pattern: Illuminate\Cache\StoreInterface and Illuminate\Config\LoaderInterface
The Provider pattern: Illuminate\Auth\AuthServiceProvider and Illuminate\Hash\HashServiceProvider

Summary
In this article, we have explained the fundamentals of design patterns. We've also introduced some design patterns that are used in the Laravel Framework.

Resources for Article:
Further resources on this subject: Laravel 4 - Creating a Simple CRUD Application in Hours [article] Your First Application [article] Creating and Using Composer Packages [article]