
How-To Tutorials - Data

1204 Articles

OpenAM: Backup, Recovery, and Logging

Packt
03 Feb 2011
9 min read
OpenAM (written and tested with OpenAM Snapshot 9) is the Single Sign-On (SSO) tool for securing your web applications in a fast and easy way:

- The first and only book that focuses on implementing Single Sign-On using OpenAM
- Learn how to use OpenAM quickly and efficiently to protect your web applications with the help of this easy-to-grasp guide
- Written by Indira Thangasamy, core team member of the OpenSSO project from which OpenAM is derived
- Real-world examples for integrating OpenAM with various applications

OpenSSO provides utilities that can be invoked, with the proper procedure, to back up the server configuration data. When a crash or data corruption occurs, the server administrator must initiate a recovery operation. Recovering a backup involves two distinct operations:

- Restoring the configuration files
- Restoring the XML configuration data that was backed up earlier

In general, recovery refers to the various operations involved in restoring, rolling forward, and rolling back a backup. Backup and recovery refers to the strategies and operations involved in protecting the configuration database against data loss, and reconstructing the database should a loss occur. In this article, you will learn how to use the tools provided by OpenSSO to perform the following:

- OpenSSO configuration backup and recovery
- Test to production
- Trace and debug level logging for troubleshooting
- Audit log configuration using flat files
- Audit log configuration using an RDBMS
- Securing the audit logs from intrusion

OpenSSO deals only with backing up the configuration data of the server; backup and recovery of identity data such as users, groups, and roles is handled by an enterprise-level identity management suite of products.

Backing up configuration data

You are by now familiar with the different configuration stores supported by OpenSSO: an embedded store (based on OpenDS) and the highly scalable Directory Server Enterprise Edition. Regardless of the underlying configuration store type, certain files are created in the local file system on the host where the server is deployed. These files (such as the bootstrap file) contain critical pieces of information that help the application initialize; any corruption in these files could prevent the server application from starting. Hence it becomes necessary to back up the configuration data stored in the file system. As a result, backup and recovery is a two-step process:

- Back up the configuration files in the local file system
- Back up the OpenSSO configuration data in the LDAP configuration store

Let us briefly discuss what each option means and when to apply it.

Backing up the OpenSSO configuration files

Typically the server can fail to come up for two reasons: either it could not find the right configuration file that locates the configuration data store, or the data contained in the configuration store is corrupted. This is the simplest case to discuss, because there could be umpteen reasons that cause a server to fail to start up. OpenSSO provides a subcommand, export-svc-cfg, as part of the ssoadm command-line interface. Using this, the customer can back up only the configuration data that is stored in the configuration store, provided the configuration store is up and running.

This backup will not help if the disk that holds the configuration files, such as the bootstrap and OpenDS schema files, crashes, because the backup obtained using export-svc-cfg does not contain these schema and bootstrap files. This jeopardizes the backup and recovery process for OpenSSO, which is why backing up the configuration directory is inevitable and vital for the recovery process. To back up the OpenSSO configuration directory, you can use any file archive tool of your choice. To perform this backup, log on to the OpenSSO host as the OpenSSO configuration user and execute the following command:

zip -r /tmp/opensso_config.zip /export/ssouser/opensso1/

This backs up all the contents of the configuration directory (in this case /export/ssouser/opensso1). Though this is the recommended way to back up, there may be server audit and debug logs that could fill up the disk space as you perform a periodic backup of this directory. The critical files and directories that need to be backed up are as follows:

- bootstrap
- opends (the whole directory, if present)
- .version
- .configParam
- certificate stores

The rest of the contents of this directory can be restored from your staging area. If you have customized any of the service schemas under the <opensso-config>/config/xml directory, make sure you back them up as well. This backup is in itself enough to bring up a corrupted OpenSSO server. When you back up the opends directory, all the OpenSSO configuration information is also backed up, so you do not really need the backup file that you would generate using export-svc-cfg. This kind of backup is extremely useful and is the only way to perform crash recovery. If OpenDS itself is corrupted due to internal database or index corruption, it will not start; in that case one cannot access the OpenSSO server or the ssoadm command-line tool to restore the XML backup. So it is a must to back up your configuration directory from the file system periodically.

Backing up the OpenSSO configuration data

The process of backing up the OpenSSO service configuration data is slightly different from a complete backup of the overall system deployment. When the subcommand export-svc-cfg is invoked, the underlying code exports all the nodes under ou=services,ROOT_SUFFIX of the configuration directory server:

./ssoadm export-svc-cfg -u amadmin -f /tmp/passwd_of_amadmin -e secretkeytoencryptpassword -o /tmp/svc-config-bkup.xml

To perform this, you need to have the ssoadm command-line tool configured. The options supplied to this command are self-explanatory, except perhaps -e. This takes a random string that is used as the encryption secret to encrypt the password entries in the service configuration data, for example the RADIUS server's shared secret value. You need this key to restore the data back to the server. OpenSSO and its configuration directory server must be running in good condition in order for this export operation to succeed. This backup will be useful in the following cases:

- The administrator accidentally deleted some of the authentication configurations
- The administrator accidentally changed some of the configuration properties
- The agent profiles have somehow lost their configuration data
- You want to reset to factory defaults

In any case, one should be able to authenticate to OpenSSO as an admin to restore the configuration data. If the server is not in that state, then crash recovery is the only option.
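The zip command shown above lends itself to a small scheduled job. The following is a minimal sketch, assuming the configuration directory is /export/ssouser/opensso1 and that dated archives are kept under /var/backups/opensso (both paths are examples, not defaults); run it as the OpenSSO configuration user, for instance from cron.

#!/bin/sh
# Minimal sketch of a periodic OpenSSO configuration-directory backup.
# CONFIG_DIR and BACKUP_DIR are assumed paths -- adjust for your deployment.
CONFIG_DIR=/export/ssouser/opensso1
BACKUP_DIR=/var/backups/opensso
STAMP=`date +%Y%m%d-%H%M%S`

mkdir -p "$BACKUP_DIR"

# Archive the whole configuration directory (bootstrap, opends, .version,
# .configParam, certificate stores, and any customized XML under config/xml).
zip -r "$BACKUP_DIR/opensso_config-$STAMP.zip" "$CONFIG_DIR"

# Prune archives older than 14 days so audit/debug logs copied into the
# backups do not fill the disk.
find "$BACKUP_DIR" -name 'opensso_config-*.zip' -mtime +14 -exec rm -f {} \;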
For the embedded store configuration, crash recovery means unzipping the file system configuration backup obtained as described in the Backing up the OpenSSO configuration files section. For configuration data that is stored in the Directory Server Enterprise Edition, the customer should use the tools that are bundled with the Oracle Directory Server Enterprise Edition to back up and restore.

Crash recovery and restore

In the previous section, we briefly covered the crash recovery part. When a crash occurs in the embedded or remote configuration store, the server will not come up again unless it is restored to a valid state. This may involve restoring the proper database state and indexes using a backup taken at a known valid state. This backup may have been obtained by using the ODSEE backup tools or simply by zipping up the OpenSSO configuration file system, as described in the Backing up the OpenSSO configuration files section. You need to bring the OpenSSO server back to a state where the administrator can log in to access the console. At this point the configuration exported to XML, as described in the Backing up the OpenSSO configuration data section, can be used. It is recommended to back up your vanilla configuration data from the file system periodically, so that it can be used in the crash recovery case (where the embedded store itself is corrupted), and to back up the configuration data using export-svc-cfg frequently. Here is a sample execution of the import-svc-cfg subcommand:

./ssoadm import-svc-cfg -u amadmin -f /tmp/passwd_of_amadmin -e mysecretenckey -X /tmp/svc-config-bkup.xml

This throws an error (because we have intentionally provided a wrong key) claiming that the secret key provided was wrong. It actually shows a string such as the following, which is a known bug:

import-service-configuration-secret-key

This is the key name that is supposed to map to a corresponding localizable error string. If you provide the correct encryption key, then it will import successfully:

./ssoadm import-svc-cfg -u amadmin -f /tmp/passwd_of_amadmin -e secretkeytoencryptpassword -X /tmp/svc-config-bkup.xml
Directory Service contains existing data. Do you want to delete it? [y|N] y
Please wait while we import the service configuration...
Service Configuration was imported.

Note that it prompts before overwriting the existing data, to make sure that the current configuration is not overwritten accidentally. There is no incremental restore, so be cautious while performing this operation. An import with a wrong version of the restore file could damage a working configuration. It is always recommended to back up the existing configuration before importing an existing configuration backup file. If you do not want to import the file, just enter N and the command will terminate without harming your data.

What happens if customers do not have the configuration files backed up? If customers do not have a copy of the configuration files to restore, they can reconfigure the OpenSSO web application by accessing the configurator (after cleaning up the existing configuration). Once they configure the server, they should be able to restore the XML backup. Nevertheless, the new configuration must match all the configuration parameters that were provided earlier, including the hostname and port details. This information can be found in the .configParam file.

If you are planning to export the configuration to a different server than the original server, refer to the Test to production section, which covers the details of how customers can migrate a test configuration to a production server. It requires more steps than simply restoring the configuration data.
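Putting the two kinds of backup together, a crash recovery on the same host might look like the following minimal sketch. It assumes the embedded (OpenDS) configuration store, the archive and paths from the backup sketch above, and a web container controlled by hypothetical stop-container and start-container scripts; substitute your own container commands and file locations.

#!/bin/sh
# Sketch of a crash-recovery sequence for an embedded-store deployment.
# All paths and the container commands below are assumptions.
CONFIG_DIR=/export/ssouser/opensso1
ARCHIVE=/var/backups/opensso/opensso_config-20110203-020000.zip

# 1. Stop the web container hosting OpenSSO (hypothetical script).
./stop-container

# 2. Replace the corrupted configuration directory with the file system backup.
rm -rf "$CONFIG_DIR"
unzip -o "$ARCHIVE" -d /

# 3. Start the container again; OpenSSO should initialize from the restored files.
./start-container

# 4. Optionally re-import the latest service configuration export.
./ssoadm import-svc-cfg -u amadmin -f /tmp/passwd_of_amadmin \
    -e secretkeytoencryptpassword -X /tmp/svc-config-bkup.xml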


Creating Time Series Charts in R

Packt
01 Feb 2011
5 min read
Formatting time series data for plotting

Time series or trend charts are the most common form of line graphs. There are many ways in R to plot such data; however, it is important to first format the data in a form that R can understand. In this recipe, we will look at some ways of formatting time series data using the base functions and some additional packages.

Getting ready

In addition to the basic R functions, we will also be using the zoo package in this recipe, so first we need to install it:

install.packages("zoo")

How to do it...

Let's use the dailysales.csv example dataset and format its date column:

sales<-read.csv("dailysales.csv")
d1<-as.Date(sales$date,"%d/%m/%y")
d2<-strptime(sales$date,"%d/%m/%y")
data.class(d1)
[1] "Date"
data.class(d2)
[1] "POSIXt"

How it works...

We have seen two different functions for converting a character vector into dates. If we did not convert the date column, R would not automatically recognize the values in the column as dates. Instead, the column would be treated as a character vector or a factor. The as.Date() function takes at least two arguments: the character vector to be converted to dates and the format to which we want it converted. It returns an object of the Date class, represented as the number of days since 1970-01-01, with negative values for earlier dates. The values in the date column are in a DD/MM/YYYY format (you can verify this by typing sales$date at the R prompt), so we specify the format argument as "%d/%m/%y". Please note that this order is important: if we instead use "%m/%d/%y", our days will be read as months and vice versa. The quotes around the value are also necessary.

The strptime() function is another way to convert character vectors into dates. However, strptime() returns a different kind of object, of class POSIXlt, which is a named list of vectors representing the different components of a date and time, such as year, month, day, hour, seconds, minutes, and a few more. POSIXlt is one of the two basic date/time classes in R. The other class, POSIXct, represents the (signed) number of seconds since the beginning of 1970 (in the UTC time zone) as a numeric vector. POSIXct is more convenient for including in data frames, and POSIXlt is closer to human-readable forms. A virtual class, POSIXt, inherits from both of these classes; that is why, when we ran the data.class() function on d2 earlier, we got POSIXt as the result. strptime() also takes the character vector to be converted and the format as arguments.

There's more...

The zoo package is handy for dealing with time series data. The zoo() function takes an argument x, which can be a numeric vector, matrix, or factor. It also takes an order.by argument, which has to be an index vector with unique entries by which the observations in x are ordered:

library(zoo)
d3<-zoo(sales$units,as.Date(sales$date,"%d/%m/%y"))
data.class(d3)
[1] "zoo"

See the help on DateTimeClasses to find out more about the ways dates can be represented in R.

Plotting date and time on the X axis

In this recipe, we will learn how to plot formatted date or time values on the X axis.

Getting ready

For the first example, we only need the base graphics function plot().

How to do it...

We will use the dailysales.csv example dataset to plot the number of units of a product sold daily in a month:

sales<-read.csv("dailysales.csv")
plot(sales$units~as.Date(sales$date,"%d/%m/%y"),type="l",
 xlab="Date",ylab="Units Sold")

How it works...

Once we have formatted the series of dates using as.Date(), we can simply pass it to the plot() function as the x variable, in either the plot(x,y) or plot(y~x) form. We can also use strptime() instead of as.Date(). However, we cannot pass the object returned by strptime() to plot() in the plot(y~x) form; we must use the plot(x,y) form as follows:

plot(strptime(sales$date,"%d/%m/%Y"),sales$units,type="l",
 xlab="Date",ylab="Units Sold")

There's more...

We can plot the example using the zoo() function as follows (assuming zoo is already installed):

library(zoo)
plot(zoo(sales$units,as.Date(sales$date,"%d/%m/%y")))

Note that we do not need to specify x and y separately when plotting using zoo; we can just pass the object returned by zoo() to plot(). We also need not specify the type as "l".

Let's look at another example, which has full date and time values on the X axis instead of just dates. We will use the openair.csv example dataset for this example:

air<-read.csv("openair.csv")
plot(air$nox~as.Date(air$date,"%d/%m/%Y %H:%M"),type="l",
 xlab="Time", ylab="Concentration (ppb)",
 main="Time trend of Oxides of Nitrogen")

The same graph can be made using zoo as follows:

plot(zoo(air$nox,as.Date(air$date,"%d/%m/%Y %H:%M")),
 xlab="Time", ylab="Concentration (ppb)",
 main="Time trend of Oxides of Nitrogen")
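If you do not have the dailysales.csv file to hand, the same steps can be exercised on a small synthetic data set. The following is a minimal sketch; the data frame built here is invented purely to mimic the shape of the example file (a date column in DD/MM/YY format and a units column).

# Build a small stand-in for dailysales.csv: 30 daily observations.
set.seed(42)
sales <- data.frame(
  date  = format(seq(as.Date("2011-01-01"), by = "day", length.out = 30),
                 "%d/%m/%y"),
  units = round(runif(30, min = 80, max = 120))
)

# Convert the character dates and check the resulting classes.
d1 <- as.Date(sales$date, "%d/%m/%y")    # Date: days since 1970-01-01
d2 <- strptime(sales$date, "%d/%m/%y")   # POSIXlt: list of date/time components
data.class(d1)  # "Date"
data.class(d2)  # "POSIXt"

# Plot the series with dates on the X axis, then again via zoo.
plot(sales$units ~ d1, type = "l", xlab = "Date", ylab = "Units Sold")

library(zoo)                              # install.packages("zoo") if needed
plot(zoo(sales$units, d1), xlab = "Date", ylab = "Units Sold")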


FAQs on Microsoft SQL Server 2008 High Availability

Packt
28 Jan 2011
6 min read
Microsoft SQL Server 2008 High Availability: minimize downtime, speed up recovery, and achieve the highest level of availability and reliability for SQL Server applications by mastering the concepts of database mirroring, log shipping, clustering, and replication.

- Install various SQL Server High Availability options in a step-by-step manner
- A guide to SQL Server High Availability for DBA aspirants, proficient developers, and system administrators
- Learn the pre- and post-installation concepts and common issues you come across while working on SQL Server High Availability
- Tips to enhance performance with SQL Server High Availability
- External references for further study

Q: What is clustering?
A: Clustering is usually deployed when a critical business application needs to be available 24 X 7, in other words highly available. These clusters are known as failover clusters because the primary goal in setting up the cluster is to keep services or business processes that are critical to the business available 24 X 7 with 99.99% uptime.

Q: How do the MS Windows Server Enterprise and Datacenter editions support failover clustering?
A: MS Windows Server Enterprise and Datacenter editions support failover clustering by having two or more identical nodes connected to each other by means of a private network and commonly used resources. In case of failure of any common resource or service, the first node (active) passes ownership to another node (passive).

Q: What is MSDTC?
A: Microsoft Distributed Transaction Coordinator (MSDTC) is a service used by SQL Server when distributed transactions are required between more than one machine. In a clustered environment, the SQL Server service can be hosted on any of the available nodes if the active node fails; MSDTC comes into the picture when we have distributed queries and for replication, and hence the MSDTC service should be running. Following are a couple of questions with regard to MSDTC.

Q: What will happen to the data that is being accessed?
A: The data is taken care of by the shared disk arrays: the storage is shared and every node that is part of the cluster can access it; however, only one node at a time can access and own it.

Q: What about clients that were connected previously? Does the failover mean that developers will have to modify the connection string?
A: Nothing like this happens. SQL Server is installed as a virtual server and has a virtual IP address, which is shared by every cluster node, so the client effectively knows only one SQL Server and its IP address. Here are the steps that explain how failover works:

1. Node 1 owns the resources and is currently the active node.
2. The network adapter driver gets corrupted or suffers physical damage.
3. The heartbeat between Node 1 and Node 2 is broken.
4. Node 2 initiates the process of taking ownership of the resources owned by Node 1. It takes approximately two to five minutes to complete the process.

Q: What is Hyper-V? What are its uses?
A: Let's see what Hyper-V is:

- It is a hypervisor-based technology that allows multiple operating systems to run on a host operating system at the same time.
- There are advantages to using SQL Server 2008 R2 on Windows Server 2008 R2 with Hyper-V; one example is the ability to migrate a live server, thereby increasing high availability without incurring downtime.
- Hyper-V now supports up to 64 logical processors.
- It can host up to four VMs on a single licensed host server. SQL Server 2008 R2 allows an unrestricted number of virtual servers, thus making consolidation easy.
- It has the ability to manage multiple SQL Servers centrally using Utility Control Point (UCP).
- The Sysprep utility can be used to create preconfigured VMs so that SQL Server deployment becomes easier.

Q: What are the hardware, software, and operating system requirements for installing SQL Server 2008 R2?
A: The following are the hardware requirements:

- Processor: Intel Pentium 3 or higher
- Processor speed: 1 GHz or higher
- RAM: 512 MB, but 2 GB is recommended
- Display: VGA or higher

The following are the software requirements:

- Operating system: Windows 7 Ultimate, Windows Server 2003 (x86 or x64), or Windows Server 2008 (x86 or x64)
- Disk space: minimum 1 GB
- .NET Framework 3.5
- Windows Installer 4.5 or later
- MDAC 2.8 SP1 or later

The following is the operating system requirement for clustering: to install SQL Server 2008 clustering, it is essential to have Windows Server 2008 Enterprise or Datacenter Edition installed on our host system with a Full Installation, so that we don't have to go back and forth installing the required components and restarting the system.

Q: What is to be done when we see the network binding warning coming up?
A: In this scenario, go to Network and Sharing Center | Change Adapter Settings. Once there, press Alt + F and select Advanced Settings. Select Public Network and move it up if it is not already at the top, and repeat this process on the second node.

Q: What is the difference between an Active/Passive and an Active/Active failover cluster?
A: In reality, there is only one difference between a single-instance (Active/Passive) and a multi-instance (Active/Active) failover cluster. As the name suggests, in a multi-instance cluster there are two or more active SQL Server instances running in the cluster, compared to one instance in a single-instance cluster. Also, to configure a multi-instance cluster, we may need to procure additional disks, IP addresses, and network names for SQL Server.

Q: What is the benefit of having a multi-instance, that is, Active/Active configuration?
A: Depending on the business requirement and the capability of our hardware, we may have one or more instances running in our cluster environment. The main goal is better uptime and better high availability by having multiple SQL Server instances running in the environment. Should anything go wrong with one SQL Server instance, another instance can easily take over control and keep the business-critical application up and running.

Q: What will be the difference in the prerequisites for a multi-instance failover cluster compared to a single-instance failover cluster?
A: There is no difference compared to a single-instance failover cluster, except that we need to procure additional disk(s), network names, and IP addresses, and we need to make sure that our hardware is capable of handling the requests that come from client machines for both instances. Installing a multi-instance cluster is almost the same as installing a single-instance cluster, except for the need to add a few resources along with a couple of extra steps here and there.


OpenAM Identity Stores: Types, Supported Types, Caching and Notification

Packt
27 Jan 2011
11 min read
Like any other service, the Identity Repository service is defined using an XML file named idRepoService.xml, which can be found in <conf-dir>/config/xml. In this file one can define as many subschemas as needed. By default, the following subschema names are defined:

- LDAPv3
- LDAPv3ForAMDS
- LDAPv3ForOpenDS
- LDAPv3ForTivoli
- LDAPv3ForAD
- LDAPv3ForADAM
- Files
- Database

However, not all of them are supported in the version that was tested while writing this article. For instance, the Files, LDAPv3, and Database subschemas are meant to be sample implementations; one can extend them for other databases, keeping them as examples. The rest of the sub-configurations are all well tested and supported. One of the Identity Repository types, the Access Manager Repository, is missing from this definition, as adding it to the OpenSSO server is a manual process; that is something which will be detailed later in this article. It is also called the legacy SDK plugin for OpenSSO. The Identity Repository framework requires support from the logging service and session management to deliver its overall functionality.

Identity store types

In OpenSSO, multiple types of Identity Repository plugins are implemented, including the following:

- LDAPv3Repo
- AgentsRepo
- InternalRepo/SpecialRepo
- FilesRepo
- AMSDKRepo

Unlike the Access Manager Repository plugin, these are available in a vanilla OpenSSO server, so customers can use them readily without having to perform any additional configuration steps.

LDAPv3Repo: This is the plugin that customers and administrators will use quite frequently, as the other plugin implementations are mostly meant to be used by OpenSSO internal services. This plugin forms the basis for building the configuration for supporting various LDAP servers, including Microsoft Active Directory, Active Directory Application Mode (ADAM/LDS), IBM Tivoli Directory, OpenDS, and Oracle Directory Server Enterprise Edition. There are subschemas defined for each of these LDAP servers in the IdRepo service schema, as described at the beginning of this section.

AgentsRepo: This is a plugin that is used to manage the OpenSSO policy agents' profiles. Unlike LDAPv3Repo, AgentsRepo uses the configuration repository to store the agents' configuration data, including authentication information. Prior to the Agents 3.0 version, all agents accessing earlier versions of OpenSSO, such as Access Manager 7.1, had most of their configuration data stored locally in the file system as plain text files. This imposed huge management problems for customers wanting to upgrade or change any configuration parameters, as it required them to log in to each host where the agents were installed. Besides, the configuration of all agents prior to 3.0 was stored in the user identity store. In OpenSSO, the agents' profiles and configurations are stored as part of the configuration Directory Information Tree (DIT). AgentsRepo is a hidden internal repository plugin, and at no point should it be visible to end users or administrators for modification.

SpecialRepo: In the predecessor of OpenSSO, the administrative users were stored as part of the user identity store, so even when the configuration store was up and running, administrators still could not log in to the system unless the user identity store was also up and running. This limits the customer experience, especially during pilot testing and troubleshooting scenarios. To overcome this, OpenSSO introduced a feature wherein all the core administrative users are stored as part of the configuration store in the IdRepo service. All administrative and special user authentication by default uses this SpecialRepo framework. It may be possible to override this behavior by invoking module-based authentication. SpecialRepo is used as a fallback repository for getting authenticated to the OpenSSO server. SpecialRepo is also a hidden internal repository plugin; at no point should it be visible to end users or administrators for modification.

FilesRepo: This is no longer supported in the OpenSSO product. You can see references to it in the source code, but it cannot be configured to use a flat file store for either configuration data or user identity data.

AMSDKRepo: This plugin has been made available to maintain compatibility with the Sun Java System Access Manager versions. When this plugin is enabled, the identity schema is defined using the DAI service as described in ums.xml. This plugin is not available in the vanilla OpenSSO server; the administrator has to perform certain manual steps to make it available for use. In this plugin, identity management is tightly coupled with the Oracle Directory Server Enterprise Edition. It is generally useful in the co-existence scenario where OpenSSO needs to co-exist with Sun Access Manager. In this article, wherever we refer to the "Access Manager Repository plugin", it means AMSDKRepo.

Besides these, there is a sample implementation of a MySQL-based database repository available as part of the server default configuration. It works; however, it has not been extensively tested for all the OpenSSO features. You can also refer to another discussion on a custom database repository implementation at this link: http://www.badgers-in-foil.co.uk/notes/installing_a_custom_opensso_identity_repository/.

Caching and notification

For LDAPv3Repo, the key feature that enables it to perform and scale is the caching of result sets for each client query, without which it would be impossible to achieve the required performance and scalability. When caching is employed, there is a possibility that clients could get stale information about identities. This can be avoided by cleaning up the cache periodically or by having an external event dirty the cache so new values can be cached. OpenSSO provides more than one way to handle this caching and notification. There are a couple of ways in which the cache can be invalidated and refreshed. The Identity Repository design relies broadly on two types of mechanisms to refresh the IdRepo cache:

- Persistent search-based event notification
- Time-to-live (TTL) based refresh

Both methods have their own merits, they can be enabled simultaneously, and doing so is recommended. This handles the scenario where a network glitch (which could cause packet loss) causes the OpenSSO server to miss some change notifications. The value of the TTL depends purely on the deployment environment and the desired end user experience.

Persistent search-based notification

The OpenSSO Identity Repository plugin cache can be invalidated and refreshed by registering a persistent search connection with the backend LDAP server, provided the LDAP server supports the persistent search control. The persistent search control (http://www.mozilla.org/directory/ietf-docs/draft-smith-psearch-ldap-01.txt), 2.16.840.1.113730.3.4.3, is implemented by many commercial LDAP servers, including:

- IBM (Tivoli Directory)
- Novell (eDirectory)
- Oracle Directory Server Enterprise Edition (ODSEE)
- OpenDS (OpenDS Directory Server 1.0.0-build007)
- Fedora-Directory/1.0.4 B2006.312.1539

In order to determine whether your LDAP vendor supports persistent search, perform the following search for the persistent search control 2.16.840.1.113730.3.4.3:

ldapsearch -p 389 -h ds_host -s base -b '' "objectclass=*" supportedControl | grep 2.16.840.1.113730.3.4.3

Microsoft Active Directory implements this in a different form, using the LDAP control 1.2.840.113556.1.4.528.

Persistent searches are governed by the max-psearch-count property in the Sun Java Directory Server, which defines the maximum number of persistent searches that can be performed on the directory server. The persistent search mechanism provides an active channel through which entries that change (and information about the changes that occur) can be communicated. As each persistent search operation uses one thread, limiting the number of simultaneous persistent searches prevents certain kinds of denial-of-service attacks. A client implementation that generates a large number of persistent connections to a single directory server may indicate that LDAP was not the correct transport. However, horizontal scaling using Directory Proxy Servers, or an LDAP consumer tier, may help to spread the load. The best solution, from an LDAP implementation point of view, is to limit persistent searches.

If you have created a user data store against an LDAP server that supports the persistent search control, then a persistent search connection will be created with the base DN configured in the LDAPv3 configuration. The search filter for this connection is obtained from the data store configuration properties. Though it is possible to listen for a specific type of change event, OpenSSO registers the persistent search connections to receive all kinds of change events. The IdRepo framework has the logic to determine whether the underlying directory server supports persistent searches; if not, it does not try to submit the persistent search. In this case customers may resort to TTL-based notification, as described in the next section.

Each active persistent search request requires that an open TCP connection be maintained between an LDAP client (in this case OpenSSO) and an LDAP server (the backend user store) that might not otherwise be kept open. The OpenSSO server, acting as an LDAP client, closes idle LDAP connections to the backend LDAP server in order to maximize resource utilization. If the OpenSSO servers are behind a load balancer or a firewall, you need to tune the value of com.sun.am.event.connection.idle.timeout. If the persistent search connections are made through a load balancer (LB) or firewall, then these connections are subject to the TCP timeout value of the respective LB and/or firewall. In such a scenario, once the firewall closes the persistent search connection due to an idle TCP timeout, change notifications cannot reach OpenSSO unless the persistent search connection is re-established.

Customers can avoid this scenario by configuring the idle timeout for the persistent search connection so that the persistent search TCP connection is restarted before the LB/firewall idle timeout; that way the LB/firewall never sees an idle persistent search connection. The advanced server configuration property com.sun.am.event.connection.idle.timeout specifies the timeout value, in minutes, after which persistent searches are restarted. Ideally, this value should be lower than the LB/firewall TCP timeout, to make sure that the persistent searches are restarted before the connections are dropped. A value of "0" indicates that these searches will not be restarted; by default the value is "0". Only the connections that have timed out will be reset. You should never set this value higher than the LB/firewall timeout, and the delta should not be more than five minutes: if your LB's idle connection timeout is 50 minutes, then set this property to 45 minutes. If for some reason you want to prevent the persistent search from being submitted to the backend LDAP server, just leave the persistent search base (sun-idrepo-ldapv3-config-psearchbase) empty; this causes the IdRepo to disable the persistent search connection.

Time-to-live based notification

There may be deployment scenarios where persistent search-based notification is not possible, or the underlying LDAP server may not support the persistent search control. In such scenarios, customers can employ the TTL or time-to-live based notification mechanism. It is a feature that involves a proprietary implementation by the OpenSSO server. This feature works in a fashion similar to the polling mechanism in the OpenSSO clients, where the client periodically polls the OpenSSO server for changes; this is often called the "pull" model, whereas persistent search-based notification is termed the "push" model (the LDAP server pushes the changes to the clients). Regardless of any persistent search-based change notifications, the OpenSSO server polls the underlying directory server and gets the data to refresh its Identity Repository cache.

TTL-specific properties for the Identity Repository cache

When the OpenSSO deployment is configured for TTL-based cache refresh, certain server-side properties need to be configured to enable the Identity Repository framework to refresh the cache. The following are the core properties that are relevant in the TTL context:

- com.sun.identity.idm.cache.entry.expire.enabled=true
- com.sun.identity.idm.cache.entry.user.expire.time=1 (in minutes)
- com.sun.identity.idm.cache.entry.default.expire.time=1 (in minutes)

The properties com.sun.identity.idm.cache.entry.user.expire.time and com.sun.identity.idm.cache.entry.default.expire.time specify the time, in minutes, for which user entries and non-user entries (such as roles and groups) respectively remain valid after their last modification. In other words, after this specified period of time elapses, the cached data for the entry expires; at that point, new requests for these entries result in a fresh read from the underlying Identity Repository plugins. When the property com.sun.identity.idm.cache.entry.expire.enabled is set to true, non-user object cache entries expire based on the time specified in the com.sun.identity.idm.cache.entry.default.expire.time property, and user entry objects are cleaned up based on the value set in the com.sun.identity.idm.cache.entry.user.expire.time property.
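As a quick operational check before choosing between the two refresh mechanisms, the ldapsearch probe shown earlier can be wrapped in a small script. This is a minimal sketch: DS_HOST and DS_PORT are placeholders for your user data store, and some directories require an authenticated bind before they expose supportedControl on the root DSE.

#!/bin/sh
# Check whether a user-store directory advertises the persistent search control.
# DS_HOST and DS_PORT are assumptions -- point them at your user data store.
DS_HOST=ds_host
DS_PORT=389
PSEARCH_OID=2.16.840.1.113730.3.4.3      # standard persistent search control
AD_NOTIFY_OID=1.2.840.113556.1.4.528     # Active Directory's equivalent control

CONTROLS=`ldapsearch -p "$DS_PORT" -h "$DS_HOST" -s base -b '' \
  "objectclass=*" supportedControl`

if echo "$CONTROLS" | grep -q "$PSEARCH_OID"; then
  echo "Persistent search supported: psearch-based notification can be used."
elif echo "$CONTROLS" | grep -q "$AD_NOTIFY_OID"; then
  echo "Active Directory change-notification control found."
else
  echo "No persistent search support: configure the TTL properties instead"
  echo "(com.sun.identity.idm.cache.entry.expire.enabled and related settings)."
fi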


Choosing Styles of Various Graph Elements in R

Packt
25 Jan 2011
4 min read
R Graph Cookbook: detailed hands-on recipes for creating the most useful types of graphs in R, starting from the simplest versions and moving to more advanced applications.

- Learn to draw any type of graph or visual data representation in R
- Filled with practical tips and techniques for creating any type of graph you need; not just theoretical explanations
- All examples are accompanied by the corresponding graph images, so you know what the results look like
- Each recipe is independent and contains the complete explanation and code to perform the task as efficiently as possible

Choosing plotting point symbol styles and sizes

In this recipe, we will see how we can adjust the styling of plotting symbols, which is useful and necessary when we plot more than one set of points, representing different groups of data, on the same graph.

Getting ready

All you need to try out this recipe is to run R and type the recipe at the command prompt. You can also choose to save the recipe as a script so that you can use it again later. We will also use the cityrain.csv example data file. Please read the file into R as follows:

rain<-read.csv("cityrain.csv")

How to do it...

The plotting symbol and size can be set using the pch and cex arguments:

plot(rnorm(100),pch=19,cex=2)

How it works...

The pch argument stands for plotting character (symbol). It can take numerical values (usually between 0 and 25) as well as single character values. Each numerical value represents a different symbol; for example, 1 represents circles, 2 represents triangles, 3 represents plus signs, and so on. If we set the value of pch to a character such as "*" or "£" in inverted commas, then the data points are drawn as that character instead of the default circles.

The size of the plotting symbol is controlled by the cex argument, which takes numerical values starting at 0, giving the amount by which plotting symbols should be magnified relative to the default. Note that cex takes relative values (the default is 1), so the absolute size may vary depending on the defaults of the graphics device in use. For example, the size of plotting symbols with the same cex value may differ between a graph saved as a PNG file and a graph saved as a PDF.

There's more...

The most common use of pch and cex is when we don't want to use color to distinguish between different groups of data points. This is often the case in scientific journals that do not accept color images. For example, let's plot the city rainfall data as a set of points instead of lines:

plot(rain$Tokyo, ylim=c(0,250), main="Monthly Rainfall in major cities", xlab="Month of Year", ylab="Rainfall (mm)", pch=1)
points(rain$NewYork,pch=2)
points(rain$London,pch=3)
points(rain$Berlin,pch=4)
legend("top", legend=c("Tokyo","New York","London","Berlin"), ncol=4, cex=0.8, bty="n", pch=1:4)

Choosing line styles and width

Similar to plotting point symbols, R provides simple ways to adjust the style of lines in graphs.

Getting ready

All you need to try out this recipe is to run R and type the recipe at the command prompt. You can also choose to save the recipe as a script so that you can use it again later. We will again use the cityrain.csv data file.

How to do it...

Line styles can be set using the lty and lwd arguments (for line type and width respectively) in the plot(), lines(), and par() commands. Let's take our rainfall example and apply different line styles, keeping the color the same:

plot(rain$Tokyo, ylim=c(0,250), main="Monthly Rainfall in major cities", xlab="Month of Year", ylab="Rainfall (mm)", type="l", lty=1, lwd=2)
lines(rain$NewYork,lty=2,lwd=2)
lines(rain$London,lty=3,lwd=2)
lines(rain$Berlin,lty=4,lwd=2)
legend("top", legend=c("Tokyo","New York","London","Berlin"), ncol=4, cex=0.8, bty="n", lty=1:4, lwd=2)

How it works...

Both line type and width can be set with numerical values, as shown in the previous example. Line type number values correspond to types of lines:

- 0: blank
- 1: solid (default)
- 2: dashed
- 3: dotted
- 4: dotdash
- 5: longdash
- 6: twodash

We can also use character strings instead of numbers, for example lty="dashed" instead of lty=2. The line width argument lwd takes positive numerical values; the default value is 1. In the example we used a value of 2, thus making the lines thicker than the default.
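For a self-contained variant that does not depend on cityrain.csv, the sketch below plots two invented monthly rainfall series and distinguishes them purely by plotting symbol and line type, as a journal that rejects color figures would require; the city names and values are made up for illustration.

# Two made-up monthly rainfall series (mm) -- values are for illustration only.
months <- 1:12
cityA  <- c(50, 45, 60, 80, 110, 150, 180, 170, 130, 90, 60, 55)
cityB  <- c(80, 70, 65, 60, 55, 40, 30, 35, 50, 70, 85, 90)

# Distinguish the series with pch and lty instead of color.
plot(months, cityA, type = "o", pch = 1, lty = 1, lwd = 2,
     ylim = c(0, 200), xlab = "Month of Year", ylab = "Rainfall (mm)",
     main = "Monthly Rainfall (symbols and line types, no color)")
lines(months, cityB, type = "o", pch = 2, lty = 2, lwd = 2)

legend("top", legend = c("City A", "City B"),
       pch = 1:2, lty = 1:2, lwd = 2, ncol = 2, bty = "n", cex = 0.8)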


Adjusting Key Parameters in R

Packt
24 Jan 2011
6 min read
R Graph Cookbook: detailed hands-on recipes for creating the most useful types of graphs in R, starting from the simplest versions and moving to more advanced applications.

- Learn to draw any type of graph or visual data representation in R
- Filled with practical tips and techniques for creating any type of graph you need; not just theoretical explanations
- All examples are accompanied by the corresponding graph images, so you know what the results look like
- Each recipe is independent and contains the complete explanation and code to perform the task as efficiently as possible

Setting colors of points, lines, and bars

In this recipe we will learn the simplest way to change the colors of points, lines, and bars in scatter plots, line plots, histograms, and bar plots.

Getting ready

All you need to try out this recipe is to run R and type the recipe at the command prompt. You can also choose to save the recipe as a script so that you can use it again later.

How to do it...

The simplest way to change the color of any graph element is by using the col argument. For example, the plot() function takes the col argument:

plot(rnorm(1000), col="red")

If we choose the plot type as line, then the color is applied to the plotted line. Let's use the dailysales.csv example dataset. First, we need to load it:

sales <- read.csv("dailysales.csv",header=TRUE)
plot(sales$units~as.Date(sales$date,"%d/%m/%y"),
 type="l", #Specify type of plot as l for line
 col="blue")

Similarly, the points() and lines() functions apply the col argument's value to the plotted points and lines respectively. barplot() and hist() also take the col argument and apply it to the bars. So the following code would produce a bar plot with blue bars:

barplot(sales$ProductA~sales$City, col="blue")

The col argument for boxplot() is applied to the color of the boxes plotted.

How it works...

The col argument automatically applies the specified color to the elements being plotted, based on the plot type. So, if we do not specify a plot type or choose points, then the color is applied to points. Similarly, if we choose the plot type as line, then the color is applied to the plotted line, and if we use the col argument in the barplot() or hist() commands, then the color is applied to the bars. col accepts names of colors such as red, blue, and black. The colors() (or colours()) function lists all the built-in colors (more than 650) available in R. We can also specify colors as hexadecimal codes such as #FF0000 (for red), #0000FF (for blue), and #000000 (for black). If you have ever made any web pages, you would know that these hex codes are used in HTML to represent colors. col can also take numeric values. When it is set to a numeric value, the color corresponding to that index in the current color palette is used. For example, in the default color palette the first color is black and the second color is red, so col=1 and col=2 refer to black and red respectively. Index 0 corresponds to the background color.

There's more...

In many settings, col can also take a vector of multiple colors, instead of a single color. This is useful if you wish to use more than one color in a graph. The heat.colors() function takes a number as an argument and returns a vector of that many colors. So heat.colors(5) produces a vector of five colors. Type the following at the R prompt:

heat.colors(5)

You should get the following output:

[1] "#FF0000FF" "#FF5500FF" "#FFAA00FF" "#FFFF00FF" "#FFFF80FF"

Those are five colors in hexadecimal format. Another way of specifying a vector of colors is to construct one:

barplot(as.matrix(sales[,2:4]), beside=T, legend=sales$City,
 col=c("red","blue","green","orange","pink"), border="white")

In the example, we set the value of col to c("red","blue","green","orange","pink"), which is a vector of five colors. We have to take care to make a vector whose length matches the number of elements (in this case, bars) that we are plotting. If the two numbers don't match, R will "recycle" values by repeating colors from the beginning of the vector. For example, if we had fewer colors in the vector than the number of elements, say four colors for the previous plot, then R would apply the four colors to the first four bars and then apply the first color to the fifth bar. This is called recycling in R:

barplot(as.matrix(sales[,2:4]), beside=T, legend=sales$City,
 col=c("red","blue","green","orange"), border="white")

In this example, the bars for both the first and last data rows (Seattle and Mumbai) would be of the same color (red), making it difficult to distinguish one from the other. One good way to ensure that you always have the correct number of colors is to find out the number of elements first and pass that as an argument to one of the color palette functions. For example, if we did not know the number of cities in the example we have just seen, we could do the following to make sure the number of colors matches the number of bars plotted:

barplot(as.matrix(sales[,2:4]), beside=T, legend=sales$City,
 col=heat.colors(length(sales$City)), border="white")

We used the length() function to find out the number of elements in the vector sales$City and passed that as the argument to heat.colors(). So, regardless of the number of cities, we will always have the right number of colors.

See also

In the next four recipes, we will see how to change the colors of other elements. The fourth recipe is especially useful, where we look at color combinations and palettes.
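To see the recycling behavior without the dailysales.csv file, the following sketch builds a small made-up city-by-product sales matrix and compares a hard-coded four-color vector against a palette sized with length(); the city names and figures are invented for illustration.

# Made-up units sold in five cities for three products (rows = cities).
sales_matrix <- matrix(
  c(120, 100,  60,
     90,  95,  80,
    150, 130, 140,
     80,  70,  90,
    110, 105,  75),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("Seattle", "London", "Tokyo", "Berlin", "Mumbai"),
                  c("ProductA", "ProductB", "ProductC"))
)
cities <- rownames(sales_matrix)

# Four colors for five cities: R recycles, so Seattle and Mumbai both get red.
barplot(sales_matrix, beside = TRUE, legend.text = cities,
        col = c("red", "blue", "green", "orange"), border = "white")

# Sizing the palette with length() keeps one distinct color per city.
barplot(sales_matrix, beside = TRUE, legend.text = cities,
        col = heat.colors(length(cities)), border = "white")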

Microsoft SQL Server 2008 High Availability: Understanding Domains, Users, and Security

Packt
21 Jan 2011
14 min read
Microsoft SQL Server 2008 High Availability Minimize downtime, speed up recovery, and achieve the highest level of availability and reliability for SQL server applications by mastering the concepts of database mirroring,log shipping,clustering, and replication  Install various SQL Server High Availability options in a step-by-step manner  A guide to SQL Server High Availability for DBA aspirants, proficient developers and system administrators  Learn the pre and post installation concepts and common issues you come across while working on SQL Server High Availability  Tips to enhance performance with SQL Server High Availability  External references for further study  Windows domains and domain users In the early era of Windows, operating system user were created standalone until Windows NT operating system hit the market. Windows NT, that is, Windows New Technology introduced some great feature to the world—including domains. A domain is a group of computers that run on Windows operating systems. Amongst them is a computer that holds all the information related to user authentication and user database and is called the domain controller (server), whereas every user who is part of this user database on the domain controller is called a domain user. Domain users have access to any resource across the domain and its subdomains with the privilege they have, unlike the standalone user who has access to the resources available to a specific system. With the release of Windows Server 2000, Microsoft released Active Directory (AD), which is now widely used with Windows operating system networks to store, authenticate, and control users who are part of the domain. A Windows domain uses various modes to authenticate users—encrypted passwords, various handshake methods such as PKI, Kerberos, EAP, SSL certificates, NAP, LDAP, and IP Sec policy—and makes it robust authentication. One can choose the authentication method that suits business needs and based on the environment. Let's now see various authentication methods in detail. Public Key Infrastructure (PKI): This is the most common method used to transmit data over insecure channels such as the Internet using digital certificates. It has generally two parts—the public and private keys. These keys are generated by a Certificate Authority, such as, Thawte. Public keys are stored in a directory where they are accessible by all parties. The public key is used by the message sender to send encrypted messages, which then can be decrypted using the private key. Kerberos: This is an authentication method used in client server architecture to authorize the client to use service(s) on a server in a network. In this method, when a client sends a request to use a service to a server, a request goes to the authentication server, which will generate a session key and a random value based on the username. This session key and a random value are then passed to the server, which grants or rejects the request. These sessions are for certain time period, which means for that particular amount of time the client can use the service without having to re-authenticate itself. Extensible Authentication Protocol (EAP): This is an authentication protocol generally used in wireless and point-to-point connections. SSL Certificates: A Secure Socket Layer certificate (SSL) is a digital certificate that is used to identify a website or server that provides a service to clients and sends the data in an encrypted form. 
SSL certificates are typically used by websites such as GMAIL. When we type a URL and press Enter, the web browser sends a request to the web server to identify itself. The web server then sends a copy of its SSL certificate, which is checked by the browser. If the browser trusts the certificate (this is generally done on the basis of the CA and Registration Authority and directory verification), it will send a message back to the server and in reply the web server sends an acknowledgement to the browser to start an encrypted session. Network Access Protection (NAP): This is a new platform introduced by Microsoft with the release of Windows Server 2008. It will provide access to the client, based on the identity of the client, the group it belongs to, and the level compliance it has with the policy defined by the Network Administrators. If the client doesn't have a required compliance level, NAP has mechanisms to bring the client to the compliance level dynamically and allow it to access the network. Lightweight Directory Access Protocol (LDAP): This is a protocol that runs over TCP/IP directly. It is a set of objects, that is, organizational units, printers, groups, and so on. When the client sends a request for a service, it queries the LDAP server to search for availability of information, and based on that information and level of access, it will provide access to the client. IP Security (IPSEC): IP Security is a set of protocols that provides security at the network layer. IP Sec provides two choices: Authentication Header: Here it encapsulates the authentication of the sender in a header of the network packet. Encapsulating Security Payload: Here it supports encryption of both the header and data. Now that we know basic information on Windows domains, domain users, and various authentication methods used with Windows servers, I will walk you through some of the basic and preliminary stuff about SQL Server security! Understanding SQL Server Security Security!! Now-a-days we store various kinds of information into databases and we just want to be sure that they are secured. Security is the most important word to the IT administrator and vital for everybody who has stored their information in a database as he/she needs to make sure that not a single piece of data should be made available to someone who shouldn't have access. Because all the information stored in the databases is vital, everyone wants to prevent unauthorized access to highly confidential data and here is how security implementation in SQL Server comes into the picture. With the release of SQL Server 2000, Microsoft (MS) has introduced some great security features such as authentication roles (fixed server roles and fixed database roles), application roles, various permissions levels, forcing protocol encryption, and so on, which are widely used by administrators to tighten SQL Server security. Basically, SQL Server security has two paradigms: one is SQL Server's own set of security measures and other is to integrate them with the domain. SQL Server has two methods of authentication. Windows authentication In Windows authentication mode, we can integrate domain accounts to authenticate users, and based on the group they are members of and level of access they have, DBAs can provide them access on the particular SQL server box. 
Whenever a user tries to access the SQL Server, his/her account is validated by the domain controller first, and then based on the permission it has, the domain controller allows or rejects the request; here it won't require separate login ID and password to authenticate. Once the user is authenticated, SQL server will allow access to the user based on the permission it has. These permissions are in form of Roles including Server, fixed DB Roles, and Application roles. Fixed Server Roles: These are security principals that have server-wide scope. Basically, fixed server roles are expected to manage the permissions at server level. We can add SQL logins, domain accounts, and domain groups to these roles. There are different roles that we can assign to a login, domain account, or group—the following table lists them. Fixed DB Roles: These are the roles that are assigned to a particular login for the particular database; its scope is limited to the database it has permission to. There are various fixed database roles, including the ones shown in the following table: Application Role: The Application role is a database principal that is widely used to assign user permissions for an application. For example, in a home-grown ERP, some users require only to view the data; we can create a role and add a db_datareader permission to it and then can add all those users who require read-only permission. Mixed Authentication: In the Mixed authentication mode, logins can be authenticated by the Windows domain controller or by SQL Server itself. DBAs can create logins with passwords in SQL Server. With the release of SQL Server 2005, MS has introduced password policies for SQL Server logins. Mixed mode authentication is used when one has to run a legacy application and it is not on the domain network. In my opinion, Windows authentication is good because we can use various handshake methods such as PKI, Kerberos, EAP, SSL NAP, LDAP, or IPSEC to tighten the security. SQL Server 2005 has enhancements in its security mechanisms. The most important features amongst them are password policy, native encryption, separation of users and schema, and no need to provide system administrator (SA) rights to run profiler. These are good things because SA is the super user, and with the power this account has, a user can do anything on the SQL box, including: The user can grant ALTER TRACE permission to users who require to run profiler The user can create login and users The user can grant or revoke permission to other users A schema is an object container that is owned by a user and is transferable to any other users. In earlier versions, objects are owned by users, so if the user leaves the company we cannot delete his/her account from the SQL box just because there is some object he/she has created. We first have to change the owner of that object and then we can delete that account. On the other hand, in the case of a schema, we could have dropped the user account because the object is owned by the schema. Now, SQL Server 2008 will give you more control over the configuration of security mechanisms. It allows you to configure metadata access, execution context, and auditing events using DDL triggers—the most powerful feature to audit any DDL event. 
If you wish to know more about what we have seen so far, you can go through the following links:

http://www.microsoft.com/sqlserver/2008/en/us/Security.aspx
http://www.microsoft.com/sqlserver/2005/en/us/security-features.aspx
http://technet.microsoft.com/en-us/library/cc966507.aspx

In this section, we covered the basics of domains, domain users, and SQL Server security, and we saw why security is given so much emphasis these days. In the next section, we will look at the basics of clustering and its components.

What is clustering?

Before starting with SQL Server clustering, let's look at clustering in general and at Windows clusters. The word cluster is self-descriptive: a bunch or group. When two or more computers are connected to each other over a network and share some common resources to provide redundancy or improved performance, they are known as a cluster of computers. Clustering is usually deployed when a business-critical application must be available 24x7, that is, highly available. Such clusters are known as failover clusters, because the primary goal of setting them up is to keep business-critical services available around the clock. The Enterprise and Datacenter editions of MS Windows Server support failover clustering. Failover is achieved by having two identical nodes connected to each other through a private network and commonly used resources. If a shared resource or service fails, the first node (active) passes ownership to the other node (passive). SQL Server clustering is built on top of Windows clustering, which means that before we install SQL Server clustering, Windows clustering must already be in place. Before we start, let's understand the resources commonly shared by a cluster. Clusters with 2, 4, 8, 12, or 32 nodes can be built with Windows Server 2008 R2. Clusters are categorized in the following manner:

High-availability clusters: This type of cluster is known as a failover cluster. High-availability clusters are implemented when the purpose is to provide highly available services. A failover or high-availability Microsoft cluster may have up to 16 nodes. Clustering in Windows operating systems was first introduced with Windows NT 4.0 Enterprise Edition and has been enhanced gradually since. Even though non-identical hardware can be used, identical hardware is recommended; if the node the cluster fails over to has a lower specification, performance may degrade.

Load balancing: This is the second form of cluster that can be configured. It is built by linking multiple computers and using whatever resources each node needs for operation. The linked servers are separate machines, but collectively and virtually they behave as a single system from the user's point of view. The main goal is to balance work by sharing CPU, disk, and every other possible resource among the linked nodes, which is why it is known as a load-balancing cluster. SQL Server doesn't support this form of clustering.

Compute clusters: When computers are linked together for computationally intensive work, such as aircraft simulation, they are known as a compute cluster. A well-known example is the Beowulf cluster.

Grid computing: This is another kind of clustering solution, more often used when the nodes are geographically dispersed.
This kind of cluster is also called a supercomputer or HPC cluster. Its main applications are scientific research, academic and mathematical work, and weather forecasting, where lots of CPUs and disks are required; SETI@home is a well-known example.

As far as SQL Server clusters are concerned, some cool new features were added in the latest release, SQL Server 2008, with the limitation that they are available only when SQL Server 2008 is used with Windows Server 2008. Let's have a glance at these features:

Service SID: Service SIDs were introduced with Windows Vista and Windows Server 2008. They enable us to bind permissions directly to Windows services. With SQL Server 2005, the SQL Server service account had to be a member of a domain group so that it could hold all the required permissions. This is not the case with SQL Server 2008, and we may choose service SIDs to bypass the need to provision domain groups.

Support for 16 nodes: We may add up to 16 nodes to a SQL Server 2008 cluster with SQL Server 2008 Enterprise 64-bit edition.

New cluster validation: As part of the installation, a new cluster validation step is performed. Before a node is added to an existing cluster, this step checks whether the cluster environment is compatible.

Mount drives: Drive letters are no longer essential. If the configured cluster has a limited number of drive letters available, we may mount a drive and use it in the cluster environment, provided it is associated with a base drive letter.

Geographically dispersed cluster nodes: This is a super-cool feature introduced with SQL Server 2008, which uses a VLAN to establish connectivity with the other cluster nodes. It removes the limitation of having all clustered nodes in a single location.

IPv6 support: SQL Server 2008 natively supports IPv6, which increases the network address size from 32 bits to 128 bits.

DHCP support: Windows Server 2008 clustering introduced the use of DHCP-assigned IP addresses by the cluster services, and this is supported by SQL Server 2008 clusters. Static IP addresses are still recommended, however: if an application depends on a fixed IP address, or if a DHCP lease fails to renew, the cluster's IP address resource will fail.

iSCSI support: Windows Server 2008 supports iSCSI as a storage connection; in earlier versions, only SAN and fibre channel connections were supported.

Introduction to Database Mirroring in Microsoft SQL Server 2008 High Availability

Packt
19 Jan 2011
5 min read
Microsoft SQL Server 2008 High Availability: Minimize downtime, speed up recovery, and achieve the highest level of availability and reliability for SQL Server applications by mastering the concepts of database mirroring, log shipping, clustering, and replication. Install various SQL Server High Availability options in a step-by-step manner. A guide to SQL Server High Availability for DBA aspirants, proficient developers, and system administrators. Learn the pre- and post-installation concepts and common issues you come across while working on SQL Server High Availability. Tips to enhance performance with SQL Server High Availability. External references for further study.

Introduction

Mr. Young, the Principal DBA at XY Incorporation, a large manufacturing company, was asked to come up with a solution that would make a database server highly available with minimal or no human intervention. He was also asked to keep in mind the limited budget the company has for the financial year. After careful research, he decided to go with Database Mirroring, as it provides an automatic failover option and is a cost-effective solution. Based on tests he performed on virtual test servers, he prepared a technical document for management and his peers to explain how it works.

What is Database Mirroring

Database Mirroring is an option that can be used to increase the availability of a SQL Server database by maintaining a standby copy that can take over as the production server in an emergency. As its name suggests, mirroring means making an exact copy of the data; mirroring can be done onto a disk, a website, or elsewhere. Along the same lines, Microsoft introduced Database Mirroring with SQL Server 2005 (fully supported from SP1 onwards); it performs the same function, maintaining an exact copy of a database across two physically separate database servers. As mirroring is a database-scoped feature, it is implemented per database rather than server-wide.

Disk mirroring is a technology in which data is written to two physically separate but identical hard disks at the same time; this is known as RAID 1.

Different components of Database Mirroring

To set up Database Mirroring, three components are involved. They are as follows:

Principal Server: This is the database server that sends its data, in the form of transactions, to the participating server (we'll call it the secondary/standby server).

Secondary Server: This is the database server that receives all the transactions sent by the Principal Server in order to keep its copy of the database identical.

Witness Server (optional): This server continuously monitors the Principal and Secondary Servers and enables automatic failover. It is optional; for automatic failover, the mirroring session must run in synchronous mode.

How Database Mirroring works

In Database Mirroring, the two servers are known partners and complement each other as principal and mirror; at any given time there is exactly one principal and one mirror. In effect, the DML operations performed on the Principal Server are all replayed on the mirror server. Log records are written to the log buffer before the corresponding changes reach the data pages, and Database Mirroring sends the records written to the Principal Server's log buffer to the mirror database at the same time.
All these transactions are sent sequentially and as quickly as possible. Database Mirroring operates in two different modes: asynchronous and synchronous.

Asynchronous, a.k.a. High Performance mode: Transactions are sent to the secondary server as soon as they are written to the log buffer, but they are committed at the Principal Server without waiting for the secondary server to write them. This is why the mode is called High Performance, but it also introduces the possibility of data loss.

Synchronous, a.k.a. High Safety mode: Transactions are likewise sent to the secondary server as soon as they are written to the log buffer, but the principal does not acknowledge a commit until the mirror has written the corresponding log records, so transactions are effectively committed at both ends.

Prerequisites

Let's now have a look at the prerequisites for putting Database Mirroring in place. Please be careful with these, as a single missed requisite will cause the installation to fail. A short scripted sketch of the preparation steps follows this list.

Recovery model: To implement Database Mirroring, the database must use the Full recovery model.

Initialization of the database: A copy of the database must already exist on the mirror server. To achieve this, restore the most recent full backup, followed by the transaction log backups, using the NORECOVERY option.

Database name: Ensure that the database name is the same on both the principal and the mirror.

Compatibility: Ensure that the partner servers run the same edition of SQL Server. Database Mirroring is supported by the Standard and Enterprise editions; asynchronous (High Performance) mode is an Enterprise-only feature.

Disk space: Ensure that ample space is available for the mirror database.

Service accounts: Ensure that the service accounts used are domain accounts and that they have the required permission, that is, CONNECT permission on the endpoints.

Listener port: This is the TCP port on which the Database Mirroring session is established between the principal and mirror servers.

Endpoints: These are the objects dedicated to Database Mirroring that enable SQL Server to communicate over the network.

Authentication: When the principal and mirror servers talk to each other, they must authenticate each other. The accounts used, whether local or domain accounts, need login and send-message permissions. If local accounts are used as service accounts, certificates must be used to authenticate the connection request.
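As a rough illustration of the preparation steps above, here is a minimal Python sketch, assuming pyodbc, a SQL Server ODBC driver, and a backup share reachable from both servers; every server, database, and file name is a hypothetical placeholder, and this is not the book's own procedure, just one way of scripting the equivalent T-SQL.

import pyodbc

PRINCIPAL   = "DRIVER={SQL Server};SERVER=PRINCIPAL01;DATABASE=master;Trusted_Connection=yes;"
MIRROR      = "DRIVER={SQL Server};SERVER=MIRROR01;DATABASE=master;Trusted_Connection=yes;"
BACKUP_PATH = r"\\backupshare\mirror\AdventureWorks"

def run(conn_str, statements):
    # BACKUP/RESTORE cannot run inside a user transaction, so use autocommit.
    cn = pyodbc.connect(conn_str, autocommit=True)
    cur = cn.cursor()
    for stmt in statements:
        cur.execute(stmt)
        while cur.nextset():      # drain informational messages from BACKUP/RESTORE
            pass
    cn.close()

# On the principal: force the Full recovery model, then take full and log backups.
run(PRINCIPAL, [
    "ALTER DATABASE AdventureWorks SET RECOVERY FULL;",
    "BACKUP DATABASE AdventureWorks TO DISK = '%s.bak' WITH INIT;" % BACKUP_PATH,
    "BACKUP LOG AdventureWorks TO DISK = '%s.trn' WITH INIT;" % BACKUP_PATH,
])

# On the mirror: restore both backups WITH NORECOVERY so the database stays
# in a restoring state, ready to join the mirroring session.
run(MIRROR, [
    "RESTORE DATABASE AdventureWorks FROM DISK = '%s.bak' WITH NORECOVERY;" % BACKUP_PATH,
    "RESTORE LOG AdventureWorks FROM DISK = '%s.trn' WITH NORECOVERY;" % BACKUP_PATH,
])

Once both restores complete WITH NORECOVERY, the database is ready for the mirroring endpoints and partners to be configured.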

Creating Line Graphs in R

Packt
17 Jan 2011
7 min read
Adding customized legends for multiple line graphs Line graphs with more than one line, representing more than one variable, are quite common in any kind of data analysis. In this recipe we will learn how to create and customize legends for such graphs. Getting ready We will use the base graphics library for this recipe, so all you need to do is run the recipe at the R prompt. It is good practice to save your code as a script to use again later. How to do it... First we need to load the cityrain.csv example data file, which contains monthly rainfall data for four major cities across the world. You can download this file from here. We will use the cityrain.csv example dataset. rain<-read.csv("cityrain.csv") plot(rain$Tokyo,type="b",lwd=2, xaxt="n",ylim=c(0,300),col="black", xlab="Month",ylab="Rainfall (mm)", main="Monthly Rainfall in major cities") axis(1,at=1:length(rain$Month),labels=rain$Month) lines(rain$Berlin,col="red",type="b",lwd=2) lines(rain$NewYork,col="orange",type="b",lwd=2) lines(rain$London,col="purple",type="b",lwd=2) legend("topright",legend=c("Tokyo","Berlin","New York","London"), lty=1,lwd=2,pch=21,col=c("black","red","orange","purple"), ncol=2,bty="n",cex=0.8, text.col=c("black","red","orange","purple"), inset=0.01) How it works... We used the legend() function. It is quite a flexible function and allows us to adjust the placement and styling of the legend in many ways. The first argument we passed to legend() specifies the position of the legend within the plot region. We used "topright"; other possible values are "bottomright", "bottom", "bottomleft", "left", "topleft", "top", "right", and "center". We can also specify the location of legend with x and y co-ordinates as we will soon see. The other important arguments specific to lines are lwd and lty which specify the line width and type drawn in the legend box respectively. It is important to keep these the same as the corresponding values in the plot() and lines() commands. We also set pch to 21 to replicate the type="b" argument in the plot() command. cex and text.col set the size and colors of the legend text. Note that we set the text colors to the same colors as the lines they represent. Setting bty (box type) to "n" ensures no box is drawn around the legend. This is good practice as it keeps the look of the graph clean. ncol sets the number of columns over which the legend labels are spread and inset sets the inset distance from the margins as a fraction of the plot region. There's more... Let's experiment by changing some of the arguments discussed: legend(1,300,legend=c("Tokyo","Berlin","New York","London"), lty=1,lwd=2,pch=21,col=c("black","red","orange","purple"), horiz=TRUE,bty="n",bg="yellow",cex=1, text.col=c("black","red","orange","purple")) This time we used x and y co-ordinates instead of a keyword to position the legend. We also set the horiz argument to TRUE. As the name suggests, horiz makes the legend labels horizontal instead of the default vertical. Specifying horiz overrides the ncol argument. Finally, we made the legend text bigger by setting cex to 1 and did not use the inset argument. An alternative way of creating the previous plot without having to call plot() and lines() multiple times is to use the matplot() function. To see details on how to use this function, please see the help file by running ?matplot or help(matplot) at the R prompt. 
Using margin labels instead of legends for multiple line graphs While legends are the most commonly used method of providing a key to read multiple variable graphs, they are often not the easiest to read. Labelling lines directly is one way of getting around that problem. Getting ready We will use the base graphics library for this recipe, so all you need to do is run the recipe at the R prompt. It is good practice to save your code as a script to use again later. How to do it... Let's use the gdp.txt example dataset to look at the trends in the annual GDP of five countries: gdp<-read.table("gdp_long.txt",header=T) library(RColorBrewer) pal<-brewer.pal(5,"Set1") par(mar=par()$mar+c(0,0,0,2),bty="l") plot(Canada~Year,data=gdp,type="l",lwd=2,lty=1,ylim=c(30,60), col=pal[1],main="Percentage change in GDP",ylab="") mtext(side=4,at=gdp$Canada[length(gdp$Canada)],text="Canada", col=pal[1],line=0.3,las=2) lines(gdp$France~gdp$Year,col=pal[2],lwd=2) mtext(side=4,at=gdp$France[length(gdp$France)],text="France", col=pal[2],line=0.3,las=2) lines(gdp$Germany~gdp$Year,col=pal[3],lwd=2) mtext(side=4,at=gdp$Germany[length(gdp$Germany)],text="Germany", col=pal[3],line=0.3,las=2) lines(gdp$Britain~gdp$Year,col=pal[4],lwd=2) mtext(side=4,at=gdp$Britain[length(gdp$Britain)],text="Britain", col=pal[4],line=0.3,las=2) lines(gdp$USA~gdp$Year,col=pal[5],lwd=2) mtext(side=4,at=gdp$USA[length(gdp$USA)]-2, text="USA",col=pal[5],line=0.3,las=2) How it works... We first read the gdp.txt data file using the read.table() function. Next we loaded the RColorBrewer color palette library and set our color palette pal to "Set1" (with five colors). Before drawing the graph, we used the par() command to add extra space to the right margin, so that we have enough space for the labels. Depending on the size of the text labels you may have to experiment with this margin until you get it right. Finally, we set the box type (bty) to an L-shape ("l") so that there is no line on the right margin. We can also set it to "c" if we want to keep the top line. We used the mtext() function to label each of the lines individually in the right margin. The first argument we passed to the function is the side where we want the label to be placed. Sides (margins) are numbered starting from 1 for the bottom side and going round in a clockwise direction so that 2 is left, 3 is top, and 4 is right. The at argument was used to specify the Y co-ordinate of the label. This is a bit tricky because we have to make sure we place the label as close to the corresponding line as possible. So, here we have used the last value of each line. For example, gdp$France[length(gdp$France) picks the last value in the France vector by using its length as the index. Note that we had to adjust the value for USA by subtracting 2 from its last value so that it doesn't overlap the label for Canada. We used the text argument to set the text of the labels as country names. We set the col argument to the appropriate element of the pal vector by using a number index. The line argument sets an offset in terms of margin lines, starting at 0 counting outwards. Finally, setting las to 2 rotates the labels to be perpendicular to the axis, instead of the default value of 1 which makes them parallel to the axis. Sometimes, simply using the last value of a set of values may not work because the value may be missing. In that case we can use the second last value or visually choose a value that places the label closest to the line. 
Also, the size of the plot window and the proximity of the final values may cause overlapping of labels. So, we may need to iterate a few times before we get the placement right. We can write functions to automate this process but it is still good to visually inspect the outcome.

Regular Expressions in Python 2.6 Text Processing

Packt
13 Jan 2011
17 min read
Python 2.6 Text Processing: Beginners Guide Simple string matching Regular expressions are notoriously hard to read, especially if you're not familiar with the obscure syntax. For that reason, let's start simple and look at some easy regular expressions at the most basic level. Before we begin, remember that Python raw strings allow us to include backslashes without the need for additional escaping. Whenever you define regular expressions, you should do so using the raw string syntax. Time for action – testing an HTTP URL In this example, we'll check values as they're entered via the command line as a means to introduce the technology. We'll dive deeper into regular expressions as we move forward. We'll be scanning URLs to ensure our end users inputted valid data. Create a new file and name it number_regex.py. Enter the following code: import sys import re # Make sure we have a single URL argument. if len(sys.argv) != 2: print >>sys.stderr, "URL Required" sys.exit(-1) # Easier access. url = sys.argv[1] # Ensure we were passed a somewhat valid URL. # This is a superficial test. if re.match(r'^https?:/{2}w.+$', url): print "This looks valid" else: print "This looks invalid" Now, run the example script on the command line a few times, passing various different values to it on the command line. (text_processing)$ python url_regex.py http://www.jmcneil.net This looks valid (text_processing)$ python url_regex.py http://intranet This looks valid (text_processing)$ python url_regex.py http://www.packtpub.com This looks valid (text_processing)$ python url_regex.py https://store This looks valid (text_processing)$ python url_regex.py httpsstore This looks invalid (text_processing)$ python url_regex.py https:??store This looks invalid (text_processing)$ What just happened? We took a look at a very simple pattern and introduced you to the plumbing needed to perform a match test. Let's walk through this little example, skipping the boilerplate code. First of all, we imported the re module. The re module, as you probably inferred from the name, contains all of Python's regular expression support. Any time you need to work with regular expressions, you'll need to import the re module. Next, we read a URL from the command line and bind a temporary attribute, which makes for cleaner code. Directly below that, you should notice a line that reads re.match(r'^https?:/{2}w.+$', url). This line checks to determine whether the string referenced by the url attribute matches the ^https?:/{2}w.+$ pattern. If a match is found, we'll print a success message; otherwise, the end user would receive some negative feedback indicating that the input value is incorrect. This example leaves out a lot of details regarding HTTP URL formats. If you were performing validation on user input, one place to look would be http://formencode.org/. FormEncode is a HTML form-processing and data-validation framework written by Ian Bicking. Understanding the match function The most basic method of testing for a match is via the re.match function, as we did in the previous example. The match function takes a regular expression pattern and a string value. For example, consider the following snippet of code: Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) [GCC 4.2.1 (Apple Inc. build 5646)] on darwin Type "help", "copyright", "credits", or "license" for more information. 
>>> import re >>> re.match(r'pattern', 'pattern') <_sre.SRE_Match object at 0x1004811d0> >>> Here, we simply passed a regular expression of "pattern" and a string literal of "pattern" to the re.match function. As they were identical, the result was a match. The returned Match object indicates the match was successful. The re.match function returns None otherwise. >>> re.match(r'pattern', 'failure') >>> Learning basic syntax A regular expression is generally a collection of literal string data and special metacharacters that represents a pattern of text. The simplest regular expression is just literal text that only matches itself. In addition to literal text, there are a series of special characters that can be used to convey additional meaning, such as repetition, sets, wildcards, and anchors. Generally, the punctuation characters field this responsibility. Detecting repetition When building up expressions, it's useful to be able to match certain repeating patterns without needing to duplicate values. It's also beneficial to perform conditional matches. This lets us check for content such as "match the letter a, followed by the number one at least three times, but no more than seven times." For example, the code below does just that: Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) [GCC 4.2.1 (Apple Inc. build 5646)] on darwin Type "help", "copyright", "credits", or "license" for more information. >>> import re >>> re.match(r'^a1{3,7}$', 'a1111111') <_sre.SRE_Match object at 0x100481648> >>> re.match(r'^a1{3,7}$', '1111111') >>> If the repetition operator follows a valid regular expression enclosed in parenthesis, it will perform repetition on that entire expression. For example: >>> re.match(r'^(a1){3,7}$', 'a1a1a1') <_sre.SRE_Match object at 0x100493918> >>> re.match(r'^(a1){3,7}$', 'a11111') >>> The following table details all of the special characters that can be used for marking repeating values within a regular expression. Specifying character sets and classes In some circumstances, it's useful to collect groups of characters into a set such that any of the values in the set will trigger a match. It's also useful to match any character at all. The dot operator does just that. A character set is enclosed within standard square brackets. A set defines a series of alternating (or) entities that will match a given text value. If the first character within a set is a caret (^) then a negation is performed. All characters not defined by that set would then match. There are a couple of additional interesting set properties. For ranged values, it's possible to specify an entire selection using a hyphen. For example, '[0-6a-d]' would match all values between 0 and 6, and a and d. Special characters listed within brackets lose their special meaning. The exceptions to this rule are the hyphen and the closing bracket. If you need to include a closing bracket or a hyphen within a regular expression, you can either place them as the first elements in the set or escape them by preceding them with a backslash. As an example, consider the following snippet, which matches a string containing a hexadecimal number. Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) [GCC 4.2.1 (Apple Inc. build 5646)] on darwin Type "help", "copyright", "credits", or "license" for more information. 
>>> import re >>> re.match(r'^0x[a-f0-9]+$', '0xff') <_sre.SRE_Match object at 0x100481648> >>> re.match(r'^0x[a-f0-9]+$', '0x01') <_sre.SRE_Match object at 0x1004816b0> >>> re.match(r'^0x[a-f0-9]+$', '0xz') >>> In addition to the bracket notation, Python ships with some predefined classes. Generally, these are letter values prefixed with a backslash escape. When they appear within a set, the set includes all values for which they'll match. The d escape matches all digit values. It would have been possible to write the above example in a slightly more compact manner. >>> re.match(r'^0x[a-fd]+$', '0x33') <_sre.SRE_Match object at 0x100481648> >>> re.match(r'^0x[a-fd]+$', '0x3f') <_sre.SRE_Match object at 0x1004816b0> >>> The following table outlines all of the character sets and classes available: One thing that should become apparent is that lowercase classes are matches whereas their uppercase counterparts are the inverse. Applying anchors to restrict matches There are times where it's important that patterns match at a certain position within a string of text. Why is this important? Consider a simple number validation test. If a user enters a digit, but mistakenly includes a trailing letter, an expression checking for the existence of a digit alone will pass. Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29) [GCC 4.2.1 (Apple Inc. build 5646)] on darwin Type "help", "copyright", "credits", or "license" for more information. >>> import re >>> re.match(r'd', '1f') <_sre.SRE_Match object at 0x1004811d0> >>> Well, that's unexpected. The regular expression engine sees the leading '1' and considers it a match. It disregards the rest of the string as we've not instructed it to do anything else with it. To fix the problem that we have just seen, we need to apply anchors. >>> re.match(r'^d$', '6') <_sre.SRE_Match object at 0x100481648> >>> re.match(r'^d$', '6f') >>> Now, attempting to sneak in a non-digit character results in no match. By preceding our expression with a caret (^) and terminating it with a dollar sign ($), we effectively said "between the start and the end of this string, there can only be one digit." Anchors, among various other metacharacters, are considered zero-width matches. Basically, this means that a match doesn't advance the regular expression engine within the test string. We're not limited to the either end of a string, either. Here's a collection of all of the available anchors provided by Python. Wrapping it up Now that we've covered the basics of regular expression syntax, let's double back and take a look at the expression we used in our first example. It might be a bit easier if we break it down a bit more with a diagram. Now that we've provided a bit of background, this pattern should make sense. We begin the regular expression with a caret, which matches the beginning of the string. The very next element is the literal http. As our caret matches the start of a string and must be immediately followed by http, this is equivalent to saying that our string must start with http. Next, we include a question mark after the s in https. The question mark states that the previous entity should be matched either zero, or one time. By default, the evaluation engine is looking character-by-character, so the previous entity in this case is simply "s." We do this so our test passes for both secure and non-secure addresses. As we advanced forward in our string, the next special term we run into is {2}, and it follows a simple forward slash. 
This says that the forward slash should appear exactly two times. Now, in the real world, it would probably make more sense to simply type the second slash. Using the repetition check like this not only requires more typing, but it also causes the regular expression engine to work harder. Immediately after the repetition match, we include a w. The w, if you'll remember from the previous tables, expands to [0-9a-zA-Z_], or any word character. This is to ensure that our URL doesn't begin with a special character. The dot character after the w matches anything, except a new line. Essentially, we're saying "match anything else, we don't so much care." The plus sign states that the preceding wild card should match at least once. Finally, we're anchoring the end of the string. However, in this example, this isn't really necessary. Have a go hero – tidying up our URL test There are a few intentional inconsistencies and problems with this regular expression as designed. To name a few: Properly formatted URLs should only contain a few special characters. Other values should be URL-encoded using percent escapes. This regular expression doesn't check for that. It's possible to include newline characters towards the end of the URL, which is clearly not supported by any browsers! The w followed by the. + implicitly set a minimum limit of two characters after the protocol specification. A single letter is perfectly valid. You guessed it. Using what we've covered thus far, it should be possible for you to backtrack and update our regular expression in order to fix these flaws. For more information on what characters are allowed, have a look at http://www.w3schools.com/tags/ref_urlencode.asp. Advanced pattern matching In addition to basic pattern matching, regular expressions let us handle some more advanced situations as well. It's possible to group characters for purposes of precedence and reference, perform conditional checks based on what exists later, or previously, in a string, and limit exactly how much of a match actually constitutes a match. Don't worry; we'll clarify that last phrase as we move on. Let's go! Grouping When crafting a regular expression string, there are generally two reasons you would wish to group expression components together: entity precedence or to enable access to matched parts later in your application. Time for action – regular expression grouping In this example, we'll return to our LogProcessing application. Here, we'll update our log split routines to divide lines up via a regular expression as opposed to simple string manipulation. In core.py, add an import re statement to the top of the file. This makes the regular expression engine available to us. Directly above the __init__ method definition for LogProcessor, add the following lines of code. These have been split to avoid wrapping. _re = re.compile( r'^([d.]+) (S+) (S+) [([w/:+ ]+)] "(.+?)" ' r'(?P<rcode>d{3}) (S+) "(S+)" "(.+)"') Now, we're going to replace the split method with one that takes advantage of the new regular expression: def split(self, line): """ Split a logfile. Uses a simple regular expression to parse out the Apache logfile entries. 
""" line = line.strip() match = re.match(self._re, line) if not match: raise ParsingError("Malformed line: " + line) return { 'size': 0 if match.group(6) == '-' else int(match.group(6)), 'status': match.group('rcode'), 'file_requested': match.group(5).split()[1] } Running the logscan application should now produce the same output as it did when we were using a more basic, split-based approach. (text_processing)$ cat example3.log | logscan -c logscan.cfg What just happened? First of all, we imported the re module so that we have access to Python's regular expression services. Next, at the LogProcessor class level, we defined a regular expression. Though, this time we did so via re.compile rather than a simple string. Regular expressions that are used more than a handful of times should be "prepared" by running them through re.compile first. This eases the load placed on the system by frequently used patterns. The re.compile function returns a SRE_Pattern object that can be passed in just about anywhere you can pass in a regular expression. We then replace our split method to take advantage of regular expressions. As you can see, we simply pass self._re in as opposed to a string-based regular expression. If we don't have a match, we raise a ParsingError, which bubbles up and generates an appropriate error message, much like we would see on an invalid split case. Now, the end of the split method probably looks somewhat peculiar to you. Here, we've referenced our matched values via group identification mechanisms rather than by their list index into the split results. Regular expression components surrounded by parenthesis create a group, which can be accessed via the group method on the Match object later down the road. It's also possible to access a previously matched group from within the same regular expression. Let's look at a somewhat smaller example. >>> match = re.match(r'(0x[0-9a-f]+) (?P<two>1)', '0xff 0xff') >>> match.group(1) '0xff' >>> match.group(2) '0xff' >>> match.group('two') '0xff' >>> match.group('failure') Traceback (most recent call last): File "<stdin>", line 1, in <module> IndexError: no such group >>> Here, we surround two distinct regular expressions components with parenthesis, (0x[0-9a-f]+), and (?P&lttwo>1). The first regular expression matches a hexadecimal number. This becomes group ID 1. The second expression matches whatever was found by the first, via the use of the 1. The "backslash-one" syntax references the first match. So, this entire regular expression only matches when we repeat the same hexadecimal number twice, separated with a space. The ?P&lttwo> syntax is detailed below. As you can see, the match is referenced after-the-fact using the match.group method, which takes a numeric index as its argument. Using standard regular expressions, you'll need to refer to a matched group using its index number. However, if you'll look at the second group, we added a (?P&ltname>) construct. This is a Python extension that lets us refer to groupings by name, rather than by numeric group ID. The result is that we can reference groups of this type by name as opposed to simple numbers. Finally, if an invalid group ID is passed in, an IndexError exception is thrown. The following table outlines the characters used for building groups within a Python regular expression: Finally, it's worth pointing out that parenthesis can also be used to alter priority as well. For example, consider this code. 
>>> re.match(r'abc{2}', 'abcc') <_sre.SRE_Match object at 0x1004818b8> >>> re.match(r'a(bc){2}', 'abcc') >>> re.match(r'a(bc){2}', 'abcbc') <_sre.SRE_Match object at 0x1004937b0> >>> Whereas the first example matches c exactly two times, the second and third line require us to repeat bc twice. This changes the meaning of the regular expression from "repeat the previous character twice" to "repeat the previous match within parenthesis twice." The value within the group could have been its own complex regular expression, such as a([b-c]) {2}. Have a go hero – updating our stats processor to use named groups Spend a couple of minutes and update our statistics processor to use named groups rather than integer-based references. This makes it slightly easier to read the assignment code in the split method. You do not need to create names for all of the groups, simply the ones we're actually using will do. Using greedy versus non-greedy operators Regular expressions generally like to match as much text as possible before giving up or yielding to the next token in a pattern string. If that behavior is unexpected and not fully understood, it can be difficult to get your regular expression correct. Let's take a look at a small code sample to illustrate the point. Suppose that with your newfound knowledge of regular expressions, you decided to write a small script to remove the angled brackets surrounding HTML tags. You might be tempted to do it like this: >>> match = re.match(r'(?P<tag><.+>)', '<title>Web Page</title>') >>> match.group('tag') '<title>Web Page</title>' >>> The result is probably not what you expected. The reason we got this result was due to the fact that regular expressions are greedy by nature. That is, they'll attempt to match as much as possible. If you look closely, &lttitle> is a match for the supplied regular expression, as is the entire &lttitle&gtWeb Page</title> string. Both start with an angled-bracket, contain at least one character, and both end with an angled bracket. The fix is to insert the question mark character, or the non-greedy operator, directly after the repetition specification. So, the following code snippet fixes the problem. >>> match = re.match(r'(?P<tag><.+?>)', '<title>Web Page</title>') >>> match.group('tag') '<title>' >>> The question mark changes our meaning from "match as much as you possibly can" to "match only the minimum required to actually match."
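The difference is easiest to see when several tags appear on one line. The short, hypothetical snippet below (ours, not from the book) contrasts the greedy and non-greedy forms using re.findall.

>>> import re
>>> html = '<b>bold</b> and <i>italic</i>'
>>> re.findall(r'<.+>', html)    # greedy: swallows everything from the first < to the last >
['<b>bold</b> and <i>italic</i>']
>>> re.findall(r'<.+?>', html)   # non-greedy: stops at the first closing >
['<b>', '</b>', '<i>', '</i>']
>>>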

Microsoft SQL Server 2008 High Availability: Installing Database Mirroring

Packt
11 Jan 2011
4 min read
Introduction

First, let's briefly recap what Database Mirroring is. Database Mirroring is an option that can be used to increase the availability of a SQL Server database by maintaining a standby copy that can act as an alternate production server in an emergency. As its name suggests, mirroring means making an exact copy of the data; mirroring can be done onto a disk, a website, or elsewhere. Now let's move on to the topic of this article: installation of Database Mirroring.

Preparing for Database Mirroring

Before we move forward, we shall prepare the database for Database Mirroring. Here are the steps:

1. Ensure that the database uses the Full recovery model; if it does not, switch it to Full recovery.
2. Execute the full backup command, followed by the transaction log backup command. I run the RESTORE VERIFYONLY command after the backup completes; this command checks the validity of a backup file, and it is recommended to always verify the backup.
3. Now that we have a full database backup and a log backup file, move them over to the server we have identified as the mirror.
4. Perform the database restore, followed by the log restore, with the NORECOVERY option. NORECOVERY is necessary so that additional log backups or transactions can still be applied.

Installing Database Mirroring

As the database that we want to participate in Database Mirroring is now ready, we can move on to the actual installation process:

1. Right-click on the database we want to mirror and select Tasks | Mirror.... This opens the Mirroring page of the Database Properties dialog.
2. To start the actual setup, click on the Configure Security... button.
3. In this dialog box, select the No option, as we are not including the Witness Server at this stage; we will perform that task later.
4. In the next dialog box, connect to the Principal Server. Specify the Listener Port and Endpoint name, and click Next.
5. We are now asked to configure the same properties for the Mirror Server: the listener port and the endpoint name.
6. In this step, the installation wizard asks us to specify the service account that will be used by the Database Mirroring operation. If the local system account is used as the service account, certificates must be used for authentication.

Generally, certificates are used by websites to assure their users that their information is secure. Certificates are digital documents that store the digital signature or identity information of the holder for authenticity purposes. They help ensure that every byte of information sent over the Internet, an intranet, or a VPN, and stored at the server, is safe.
Certificates are installed on the servers either by obtaining them from providers such as http://www.thwate.com, or they can be self-issued by the company's Database Administrator or Chief Information Officer using the httpcfg.exe utility. The same is true for SQL Server: it uses certificates to ensure that information is secure, and these certificates can be self-issued or obtained from an issuing authority.

In the next dialog box, make sure that the configuration details we have furnished are valid. Ensure that the names of the Principal and Mirror Servers, the endpoints, and the port numbers are correct, and click Finish. Finally, check that the setup wizard reports success at the end.
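Behind the scenes, the wizard simply issues T-SQL against each partner. For readers who prefer scripting, the following is a minimal, hypothetical Python/pyodbc sketch of the equivalent statements; the endpoint name, port, and server names are placeholder assumptions, the witness is omitted as in the walkthrough above, and this is not a substitute for the wizard's own validation.

import pyodbc

PRINCIPAL = "DRIVER={SQL Server};SERVER=PRINCIPAL01;Trusted_Connection=yes;"
MIRROR    = "DRIVER={SQL Server};SERVER=MIRROR01;Trusted_Connection=yes;"

ENDPOINT_SQL = (
    "CREATE ENDPOINT Mirroring "
    "STATE = STARTED "
    "AS TCP (LISTENER_PORT = 5022) "
    "FOR DATABASE_MIRRORING (ROLE = PARTNER);"
)

def execute(conn_str, sql):
    cn = pyodbc.connect(conn_str, autocommit=True)
    cn.cursor().execute(sql)
    cn.close()

# 1. Create a mirroring endpoint on each partner (run once per instance;
#    skip if a mirroring endpoint already exists). If different domain
#    service accounts are used, each account also needs CONNECT permission
#    on the other partner's endpoint.
execute(PRINCIPAL, ENDPOINT_SQL)
execute(MIRROR, ENDPOINT_SQL)

# 2. Point the mirror at the principal first, then the principal at the mirror.
#    The mirror database must already be restored WITH NORECOVERY.
execute(MIRROR,    "ALTER DATABASE AdventureWorks SET PARTNER = 'TCP://PRINCIPAL01.corp.local:5022';")
execute(PRINCIPAL, "ALTER DATABASE AdventureWorks SET PARTNER = 'TCP://MIRROR01.corp.local:5022';")

Running the two ALTER DATABASE ... SET PARTNER statements, mirror first and principal second, is what actually starts the mirroring session.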

Getting Started with Python 2.6 Text Processing

Packt
10 Jan 2011
14 min read
  Python 2.6 Text Processing: Beginners Guide The easiest way to learn how to manipulate text with Python The easiest way to learn text processing with Python Deals with the most important textual data formats you will encounter Learn to use the most popular text processing libraries available for Python Packed with examples to guide you through         Read more about this book       (For more resources on this subject, see here.) Categorizing types of text data Textual data comes in a variety of formats. For our purposes, we'll categorize text into three very broad groups. Isolating down into segments helps us to understand the problem a bit better, and subsequently choose a parsing approach. Each one of these sweeping groups can be further broken down into more detailed chunks. One thing to remember when working your way through the book is that text content isn't limited to the Latin alphabet. This is especially true when dealing with data acquired via the Internet. Providing information through markup Structured text includes formats such as XML and HTML. These formats generally consist of text content surrounded by special symbols or markers that give extra meaning to a file's contents. These additional tags are usually meant to convey information to the processing application and to arrange information in a tree-like structure. Markup allows a developer to define his or her own data structure, yet rely on standardized parsers to extract elements. For example, consider the following contrived HTML document. <html> <head> <title>Hello, World!</title> </head> <body> <p> Hi there, all of you earthlings. </p> <p> Take us to your leader. </p> </body></html> In this example, our document's title is clearly identified because it is surrounded by opening and closing &lttitle> and </title> elements. Note that although the document's tags give each element a meaning, it's still up to the application developer to understand what to do with a title object or a p element. Notice that while it still has meaning to us humans, it is also laid out in such a way as to make it computer friendly. One interesting aspect to these formats is that it's possible to embed references to validation rules as well as the actual document structure. This is a nice benefit in that we're able to rely on the parser to perform markup validation for us. This makes our job much easier as it's possible to trust that the input structure is valid. Meaning through structured formats Text data that falls into this category includes things such as configuration files, marker delimited data, e-mail message text, and JavaScript Object Notation web data. Content within this second category does not contain explicit markup much like XML and HTML does, but the structure and formatting is required as it conveys meaning and information about the text to the parsing application. For example, consider the format of a Windows INI file or a Linux system's /etc/hosts file. There are no tags, but the column on the left clearly means something other than the column on the right. Python provides a collection of modules and libraries intended to help us handle popular formats from this category. Understanding freeform content This category contains data that does not fall into the previous two groupings. This describes e-mail message content, letters, article copy, and other unstructured character-based content. However, this is where we'll largely have to look at building our own processing components. 
There are external packages available to us if we wish to perform common functions. Some examples include full text searching and more advanced natural language processing. Ensuring you have Python installed Our first order of business is to ensure that you have Python installed. We'll be working with Python 2.6 and we assume that you're using that same version. If there are any drastic differences in earlier releases, we'll make a note of them as we go along. All of the examples should still function properly with Python 2.4 and later versions. If you don't have Python installed, you can download the latest 2.X version from http://www.python.org. Most Linux distributions, as well as Mac OS, usually have a version of Python preinstalled. At the time of this writing, Python 2.6 was the latest version available, while 2.7 was in an alpha state. Providing support for Python 3 The examples in this book are written for Python 2. However, wherever possible, we will provide code that has already been ported to Python 3. You can find the Python 3 code in the Python3 directories in the code bundle available on the Packt Publishing FTP site. Unfortunately, we can't promise that all of the third-party libraries that we'll use will support Python 3. The Python community is working hard to port popular modules to version 3.0. However, as the versions are incompatible, there is a lot of work remaining. In situations where we cannot provide example code, we'll note this. Implementing a simple cipher Let's get going early here and implement our first script to get a feel for what's in store. A Caesar Cipher is a simple form of cryptography in which each letter of the alphabet is shifted down by a number of letters. They're generally of no cryptographic use when applied alone, but they do have some valid applications when paired with more advanced techniques. This preceding diagram depicts a cipher with an offset of three. Any X found in the source data would simply become an A in the output data. Likewise, any A found in the input data would become a D. Time for action – implementing a ROT13 encoder The most popular implementation of this system is ROT13. As its name suggests, ROT13 shifts – or rotates – each letter by 13 spaces to produce an encrypted result. As the English alphabet has 26 letters, we simply run it a second time on the encrypted text in order to get back to our original result. Let's implement a simple version of that algorithm. Start your favorite text editor and create a new Python source file. Save it as rot13.py. Enter the following code exactly as you see it below and save the file. import sysimport stringCHAR_MAP = dict(zip( string.ascii_lowercase, string.ascii_lowercase[13:26] + string.ascii_lowercase[0:13] ))def rotate13_letter(letter): """ Return the 13-char rotation of a letter. """ do_upper = False if letter.isupper(): do_upper = True letter = letter.lower() if letter not in CHAR_MAP: return letter else: letter = CHAR_MAP[letter] if do_upper: letter = letter.upper() return letterif __name__ == '__main__': for char in sys.argv[1]: sys.stdout.write(rotate13_letter(char)) sys.stdout.write('n') Now, from a command line, execute the script as follows. If you've entered all of the code correctly, you should see the same output. $ python rot13.py 'We are the knights who say, nee!' Run the script a second time, using the output of the first run as the new input string. If everything was entered correctly, the original text should be printed to the console. 
$ python rot13.py 'Dv ziv gsv pmrtsgh dsl hzb, mvv!' What just happened? We implemented a simple text-oriented cipher using a collection of Python's string handling features. We were able to see it put to use for both encoding and decoding source text. We saw a lot of stuff in this little example, so you should have a good feel for what can be accomplished using the standard Python string object. Following our initial module imports, we defined a dictionary named CHAR_MAP, which gives us a nice and simple way to shift our letters by the required 13 places. The value of a dictionary key is the target letter! We also took advantage of string slicing here. In our translation function rotate13_letter, we checked whether our input character was uppercase or lowercase and then saved that as a Boolean attribute. We then forced our input to lowercase for the translation work. As ROT13 operates on letters alone, we only performed a rotation if our input character was a letter of the Latin alphabet. We allowed other values to simply pass through. We could have just as easily forced our string to a pure uppercased value. The last thing we do in our function is restore the letter to its proper case, if necessary. This should familiarize you with upper- and lowercasing of Python ASCII strings. We're able to change the case of an entire string using this same method; it's not limited to single characters. >>> name = 'Ryan Miller'>>> name.upper()'RYAN MILLER'>>> "PLEASE DO NOT SHOUT".lower()'please do not shout'>>> It's worth pointing out here that a single character string is still a string. There is not a char type, which you may be familiar with if you're coming from a different language such as C or C++. However, it is possible to translate between character ASCII codes and back using the ord and chr built-in methods and a string with a length of one. Notice how we were able to loop through a string directly using the Python for syntax. A string object is a standard Python iterable, and we can walk through them detailed as follows. In practice, however, this isn't something you'll normally do. In most cases, it makes sense to rely on existing libraries. $ pythonPython 2.6.1 (r261:67515, Jul 7 2009, 23:51:51)[GCC 4.2.1 (Apple Inc. build 5646)] on darwinType "help", "copyright", "credits" or "license" for more information.>>> for char in "Foo":... print char...Foo>>> Finally, you should note that we ended our script with an if statement such as the following: Python modules all contain an internal __name__ variable that corresponds to the name of the module. If a module is executed directly from the command line, as is this script, whose name value is set to __main__, this code only runs if we've executed this script directly. It will not run if we import this code from a different script. You can import the code directly from the command line and see for yourself. >>> if__name__ == '__main__' Notice how we were able to import our module and see all of the methods and attributes inside of it, but the driver code did not execute. Have a go hero – more translation work Each Python string instance contains a collection of methods that operate on one or more characters. You can easily display all of the available methods and attributes by using the dir method. For example, enter the following command into a Python window. Python responds by printing a list of all methods on a string object. $ pythonPython 2.6.1 (r261:67515, Jul 7 2009, 23:51:51)[GCC 4.2.1 (Apple Inc. 
build 5646)] on darwinType "help", "copyright", "credits" or "license" for more information.>>> import rot13>>> dir(rot13)['CHAR_MAP', '__builtins__', '__doc__', '__file__', '__name__', '__package__', 'rotate13_letter', 'string', 'sys']>>> Much like the isupper and islower methods discussed previously, we also have an isspace method. Using this method, in combination with your newfound knowledge of Python strings, update the method we defined previously to translate spaces to underscores and underscores to spaces. Processing structured markup with a filter Our ROT13 application works great for simple one-line strings that we can fit on the command line. However, it wouldn't work very well if we wanted to encode an entire file, such as the HTML document we took a look at earlier. In order to support larger text documents, we'll need to change the way we accept input. We'll redesign our application to work as a filter. A filter is an application that reads data from its standard input file descriptor and writes to its standard output file descriptor. This allows users to create command pipelines that allow multiple utilities to be strung together. If you've ever typed a command such as cat /etc/hosts | grep mydomain.com, you've set up a pipeline In many circumstances, data is fed into the pipeline via the keyboard and completes its journey when a processed result is displayed on the screen. Time for action – processing as a filter Let's make the changes required to allow our simple ROT13 processor to work as a command-line filter. This will allow us to process larger files. Create a new source file and enter the following code. When complete, save the file as rot13-b.py. >>> dir("content")['__add__', '__class__', '__contains__', '__delattr__', '__doc__','__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__','__getnewargs__', '__getslice__', '__gt__', '__hash__', '__init__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__','__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '_formatter_field_name_split', '_formatter_parser', 'capitalize', 'center', 'count','decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index','isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle','isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace','rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split','splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate','upper', 'zfill']>>> Enter the following HTML data into a new text file and save it as sample_page.html. We'll use this as example input to our updated rot13.py. import sysimport stringCHAR_MAP = dict(zip( string.ascii_lowercase, string.ascii_lowercase[13:26] + string.ascii_lowercase[0:13] ))def rotate13_letter(letter): """ Return the 13-char rotation of a letter. """ do_upper = False if letter.isupper(): do_upper = True letter = letter.lower() if letter not in CHAR_MAP: return letter else: letter = CHAR_MAP[letter] if do_upper: letter = letter.upper() return letterif __name__ == '__main__': for line in sys.stdin: for char in line: sys.stdout.write(rotate13_letter(char)) Now, run our rot13.py example and provide our HTML document as standard input data. The exact method used will vary with your operating system. If you've entered the code successfully, you should simply see a new prompt. <html> <head> <title>Hello, World!</title> </head> <body> <p> Hi there, all of you earthlings. 
Processing structured markup with a filter
Our ROT13 application works great for simple one-line strings that we can fit on the command line. However, it wouldn't work very well if we wanted to encode an entire file, such as the HTML document we took a look at earlier. In order to support larger text documents, we'll need to change the way we accept input. We'll redesign our application to work as a filter.

A filter is an application that reads data from its standard input file descriptor and writes to its standard output file descriptor. This allows users to create command pipelines that string multiple utilities together. If you've ever typed a command such as cat /etc/hosts | grep mydomain.com, you've set up a pipeline. In many circumstances, data is fed into the pipeline via the keyboard and completes its journey when a processed result is displayed on the screen.

Time for action – processing as a filter
Let's make the changes required to allow our simple ROT13 processor to work as a command-line filter. This will allow us to process larger files.

Create a new source file and enter the following code. When complete, save the file as rot13-b.py.

import sys
import string

CHAR_MAP = dict(zip(
    string.ascii_lowercase,
    string.ascii_lowercase[13:26] + string.ascii_lowercase[0:13]
))

def rotate13_letter(letter):
    """
    Return the 13-char rotation of a letter.
    """
    do_upper = False
    if letter.isupper():
        do_upper = True
        letter = letter.lower()
    if letter not in CHAR_MAP:
        return letter
    else:
        letter = CHAR_MAP[letter]
        if do_upper:
            letter = letter.upper()
    return letter

if __name__ == '__main__':
    for line in sys.stdin:
        for char in line:
            sys.stdout.write(rotate13_letter(char))

Enter the following HTML data into a new text file and save it as sample_page.html. We'll use this as example input to our updated script.

<html>
 <head>
  <title>Hello, World!</title>
 </head>
 <body>
  <p>
   Hi there, all of you earthlings.
  </p>
  <p>
   Take us to your leader.
  </p>
 </body>
</html>

Now, run our rot13-b.py example and provide our HTML document as standard input data. The exact method used will vary with your operating system. If you've entered the code successfully, you should simply see a new prompt.

$ cat sample_page.html | python rot13-b.py > rot13.html
$

The contents of rot13.html should be as follows. If that's not the case, double back and make sure everything is correct.

<ugzy>
 <urnq>
  <gvgyr>Uryyb, Jbeyq!</gvgyr>
 </urnq>
 <obql>
  <c>
   Uv gurer, nyy bs lbh rneguyvatf.
  </c>
  <c>
   Gnxr hf gb lbhe yrnqre.
  </c>
 </obql>
</ugzy>

Open the translated HTML file using your web browser.

What just happened?
We updated our rot13.py script to read standard input data rather than rely on a command-line option. Doing this provides optimal configurability going forward and lets us feed input of varying length from a collection of different sources.

We did this by looping on each line available on the sys.stdin file stream and calling our translation function. We wrote each character returned by that function to the sys.stdout stream.

Next, we ran our updated script via the command line, using sample_page.html as input. As expected, the encoded version was written to rot13.html.

As you can see, there is a major problem with our output. We should have a proper page title and our content should be broken down into different paragraphs. Remember, structured markup text is sprinkled with tag elements that define its structure and organization. In this example, we not only translated the text content, we also translated the markup tags, rendering them meaningless. A web browser would not be able to display this data properly. We'll need to update our processor code to ignore the tags. We'll do just that in the next section.
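As a rough preview of what skipping the markup could look like (a sketch only, assuming tags are simple and never contain a literal '>' inside an attribute; this is not necessarily the approach the next section takes), the filter can track whether it is currently inside a tag and rotate only the characters that fall outside one:

import sys

from rot13 import rotate13_letter   # assumes the rot13.py module shown earlier is importable

def rot13_skip_tags(stream_in, stream_out):
    """
    Copy stream_in to stream_out, rotating only the text outside <...> tags.
    """
    in_tag = False
    for line in stream_in:
        for char in line:
            if char == '<':
                in_tag = True
            if in_tag:
                stream_out.write(char)                     # pass markup through untouched
            else:
                stream_out.write(rotate13_letter(char))    # rotate ordinary text
            if char == '>':
                in_tag = False

if __name__ == '__main__':
    rot13_skip_tags(sys.stdin, sys.stdout)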

Microsoft SQL Azure Tools

Packt
07 Jan 2011
6 min read
Microsoft SQL Azure Enterprise Application Development
Build enterprise-ready applications and projects with SQL Azure
Develop large scale enterprise applications using Microsoft SQL Azure
Understand how to use the various third-party programs, such as DB Artisan, RedGate, ToadSoft, and others, developed for SQL Azure
Master the exhaustive data migration and data synchronization aspects of SQL Azure
Includes SQL Azure projects in incubation and more recent developments, including all 2010 updates

SQL Azure is a subset of SQL Server 2008 R2 and, as such, the tools that are used with SQL Server can also be used with SQL Azure, albeit with only some of their features supported. The Microsoft tools can be divided further between those that are accessed from the Visual Studio series of products and those that are not. However, it appears that with Visual Studio 2010, SSMS may be largely redundant for the most commonly used tasks.

Visual Studio related
From the very beginning, Visual Studio has supported developing data-centric applications that work with the existing version of SQL Server, as well as with MS Access databases and databases from third parties. The various client APIs, such as ODBC, OLE DB, and ADO.NET, made this happen, not only for direct access to manipulate data on the server, but also for supporting web-facing applications that interact with the databases.

Two versions are of particular interest to SQL Azure, as they allow for creating applications that consume data from the SQL Azure server: Visual Studio 2008 SP1 and Visual Studio 2010, the latter recently released for production (April 12, 2010). The key features of the more recent update to SQL Azure are described here: http://hodentek.blogspot.com/2010/04/features-galore-for-sql-azure.html. It may be noted that the SQL Azure portal provides the all-important connection strings that are crucial for connecting to SQL Azure using Visual Studio.

VS2008
Although you can access and work with SQL Azure using Visual Studio 2008, it does not support establishing a data connection to SQL Azure through the graphical user interface, as you can with an on-site application. This has been remedied in Visual Studio 2010.

VS2010
Visual Studio 2010 has tighter integration and many more features than Visual Studio 2008 SP1, and it is earmarked to make the cloud offering more attractive. Of particular interest to SQL Azure is the support it offers for making a connection to SQL Azure through its interactive graphical user interface, and the recently introduced feature supporting data-tier applications. A summary of data-tier applications is available here: http://hodentek.blogspot.com/2010/02/working-with-data-gotten-lot-easier.html.

SQLBulkCopy for Data Transfer
.NET 2.0 introduced the SqlBulkCopy class in the System.Data.SqlClient namespace. This class makes it easy to move data from one server to another using Visual Studio. An example is described in the next chapter using Visual Studio 2010 RC, but a similar method can be adopted in Visual Studio 2008.

SQL Server Business Intelligence Development Studio (BIDS)
The Business Intelligence Development Studio (BIDS) falls under both SQL Server 2008 and Visual Studio. The tight integration of Visual Studio with SQL Server was instrumental in the development of BIDS.
After a successful introduction in Visual Studio 2005, more enhancements were added in Visual Studio 2008, both to Integration Services and to Reporting Services, two of the main features of BIDS. BIDS is available as a part of the Visual Studio shell when you install the recommended version of SQL Server. Even if you do not have Visual Studio installed, you get the part of Visual Studio that is needed for developing business intelligence-related Integration Services and Reporting Services applications.

SQL Server Integration Services
Microsoft SQL Server Integration Services (SSIS) is a comprehensive data integration service that superseded Data Transformation Services. Through its connection managers, it can establish connections to a variety of data sources, including SQL Azure. Many of the data-intensive tasks of moving data from on-site servers to SQL Azure can be carried out in SSIS.

SQL Server Reporting Services
SQL Server Reporting Services (SSRS) is a comprehensive reporting package that consists of a Report Server tightly integrated with SQL Server 2008 R2 and a web-based frontend client, the Report Manager. SSRS can spin off reports from data stored on SQL Azure through its powerful data binding interface.

Entity Framework Provider
Like the ODBC, OLE DB, and ADO.NET data providers, Entity Framework also features an Entity Framework Provider, although it does not connect to SQL Azure in the same way as the others. Using the Entity Framework Provider, you can create data services for a SQL Azure database, which .NET client applications can then access. In order to work with the Entity Framework Provider, you need to install the Windows Azure SDK (there are several versions of it), which provides the appropriate templates. Presently, the Entity Framework Provider cannot create the needed files for a database on SQL Azure, so a workaround is adopted.

SQL Server related
The tools described in this section can directly access SQL Server 2008 R2. The Import/Export Wizard can access not only SQL Server but also products from other vendors to which a connection can be established. Scripting support, BCP, and SSRS are all effective when the database server is installed, and they are part of the SQL Server 2008 R2 installation. SQL Server Integration Services is tightly integrated with SQL Server 2008 R2, which can store packages created using SSIS. Data access technologies and self-service business integration technologies are developing rapidly and will have an impact on cloud-based applications, including solutions using SQL Azure.

SQL Server Management Studio
SQL Server Management Studio (SSMS) has been the workhorse of the SQL Servers from the very beginning, and it also supports working with the SQL Azure server. SQL Server 2008 did not fully support SQL Azure, although it did enable a connection to SQL Azure. This was improved in the SQL Server 2008 R2 November CTP and the versions that followed. SSMS also provides the SQL Azure template, which includes most of the commands you will be using in SQL Azure.

Import/Export Wizard
The Import/Export Wizard has been present in SQL Server since earlier versions, to create DTS packages of a simple nature. It can be started from the command line using DTSWiz.exe or DTSWizard.exe; in the same folder, you can access all DTS-related files. You can double-click Import and Export Data (32-bit) in Start | All Programs | Microsoft SQL Server 2008 R2, or you may run DTSWizard.exe from a DOS prompt to launch the wizard.
The Import/Export Wizard may be able to connect to SQL Azure using any of the following:
ODBC DSN
SQL Server Native Client 10.0
.NET Framework Data Provider for SQL Server

The wizard can connect to the SQL Azure server using an ODBC DSN; however, although the connection was possible, export or import was not possible in the CTP version due to an unsupported stored procedure. Import/Export works with the .NET Framework Data Provider, but requires some tweaking.
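The same connection string that the SQL Azure portal provides can also be used from ordinary client code over ODBC. As an illustrative aside only (this is not covered in the article; pyodbc, the server name, and the credentials below are all assumptions and placeholders), a small Python sketch might look like this:

import pyodbc  # assumes the pyodbc package and a SQL Server ODBC driver are installed

# A SQL Azure connection string of the kind shown in the portal (all values made up).
conn_str = (
    "Driver={SQL Server Native Client 10.0};"
    "Server=tcp:myserver.database.windows.net,1433;"
    "Database=mydb;"
    "Uid=myuser@myserver;"
    "Pwd=mypassword;"
    "Encrypt=yes;"
)

conn = pyodbc.connect(conn_str)
cursor = conn.cursor()
cursor.execute("SELECT name FROM sys.tables")   # list the user tables in the database
for row in cursor.fetchall():
    print row[0]
conn.close()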

Granting Access in MySQL for Python

Packt
28 Dec 2010
10 min read
MySQL for Python
Integrate the flexibility of Python and the power of MySQL to boost the productivity of your Python applications
Implement the outstanding features of Python's MySQL library to their full potential
See how to make MySQL take the processing burden from your programs
Learn how to employ Python with MySQL to power your websites and desktop applications
Apply your knowledge of MySQL and Python to real-world problems instead of hypothetical scenarios
A manual packed with step-by-step exercises to integrate your Python applications with the MySQL database server

Introduction
As with creating a user, granting access can be done by modifying the mysql tables directly. However, this method is error-prone and dangerous to the stability of the system and is, therefore, not recommended.

Important dynamics of GRANTing access
While CREATE USER causes MySQL to add a user account, it does not specify that user's privileges. In order to grant a user privileges, the account of the user granting the privileges must meet two conditions:
Be able to exercise those privileges in their own account
Have the GRANT OPTION privilege on their account

In other words, it is not enough to hold a particular privilege, nor to hold the GRANT OPTION privilege by itself; only users who meet both requirements can authorize that privilege for another user. Further, privileges that are granted do not take effect until the user's first login after the command is issued. Therefore, if the user is logged into the server at the time you grant access, the changes will not take effect immediately.

The GRANT statement in MySQL
The syntax of a GRANT statement is as follows:

GRANT <privileges> ON <database>.<table> TO '<userid>'@'<hostname>';

Proceeding from the end of the statement, the userid and hostname follow the same pattern as with the CREATE USER statement. Therefore, if a user is created with a hostname specified as localhost and you grant access to that user with a hostname of '%', they will encounter a 1044 error stating that access is denied.

The database and table values must be specified individually or collectively. This allows us to customize access to individual tables as necessary. For example, to specify access to the city table of the world database, we would use world.city. In many instances, however, you are likely to grant the same access to a user for all tables of a database. To do this, we use the universal quantifier ('*'). So, to specify all tables in the world database, we would use world.*. We can apply the asterisk to the database field as well: to specify all databases and all tables, we can use *.*. (MySQL also accepts a bare * here, but note that it refers to the default database of the current session rather than to every database, so *.* is the safer choice when you mean everything.)

Finally, the privileges can be a single value or a series of comma-separated values. If, for example, you want a user to only be able to read from a database, you would grant them only the SELECT privilege. For many users and applications, reading and writing is necessary, but no ability to modify the database structure is warranted. In such cases, we can grant the user account both SELECT and INSERT privileges with SELECT, INSERT. To learn which privileges have been granted to the user account you are using, use the statement SHOW GRANTS FOR <user>@<hostname>;.
With this in mind, if we wanted to grant a user tempo all access to all tables in the music database, but only when accessing the server locally, we would use this statement:

GRANT ALL PRIVILEGES ON music.* TO 'tempo'@'localhost';

Similarly, if we wanted to restrict access to reading and writing when logging in remotely, we would change the above statement to read:

GRANT SELECT,INSERT ON music.* TO 'tempo'@'%';

If we wanted user conductor to have complete access to everything when logged in locally, we would use:

GRANT ALL PRIVILEGES ON *.* TO 'conductor'@'localhost';

Building on the second example statement, we can further specify the exact privileges we want on the columns of a table by including the column names in parentheses after each privilege. Hence, if we want tempo to be able to read from columns col3 and col4, but only write to column col4, of the sheets table in the music database, we would use this command:

GRANT SELECT (col3,col4),INSERT (col4) ON music.sheets TO 'tempo'@'%';

Note that specifying columnar privileges is only available when specifying a single database table; use of the asterisk as a universal quantifier is not allowed. Further, this syntax is allowed only for three types of privileges: SELECT, INSERT, and UPDATE.

A full list of the privileges available through MySQL is given in the MySQL documentation referenced below. MySQL does not support the standard SQL UNDER privilege and does not support the use of TRIGGER until MySQL 5.1.6. More information on MySQL privileges can be found at http://dev.mysql.com/doc/refman/5.5/en/privileges-provided.html

Using REQUIREments of access
Using GRANT with a REQUIRE clause causes MySQL to use SSL encryption. The standard used by MySQL for SSL is the X.509 standard of the International Telecommunication Union's (ITU) Standardization Sector (ITU-T). It is a commonly used public-key encryption standard for single sign-on systems. Parts of the standard are no longer in force. You can read about the parts that still apply on the ITU website at http://www.itu.int/rec/T-REC-X.509/en

The REQUIRE clause takes the following arguments, with their respective meanings, and follows the format of their respective examples:

NONE: The user account has no requirement for an SSL connection. This is the default.
GRANT SELECT (col3,col4),INSERT (col4) ON music.sheets TO 'tempo'@'%';

SSL: The client must use an SSL-encrypted connection to log in. In most MySQL clients, this is satisfied by using the --ssl-ca option at the time of login. Specifying the key or certificate is optional.
GRANT SELECT (col3,col4),INSERT (col4) ON music.sheets TO 'tempo'@'%' REQUIRE SSL;

X509: The client must use SSL to log in. Further, the certificate must be verifiable with one of the CA vendors. This option further requires the client to use the --ssl-ca option as well as specifying the key and certificate using --ssl-key and --ssl-cert, respectively.
GRANT SELECT (col3,col4),INSERT (col4) ON music.sheets TO 'tempo'@'%' REQUIRE X509;

CIPHER: Specifies the type and order of ciphers to be used.
GRANT SELECT (col3,col4),INSERT (col4) ON music.sheets TO 'tempo'@'%' REQUIRE CIPHER 'RSA-EDH-CBC3-DES-SHA';

ISSUER: Specifies the issuer from whom the certificate used by the client is to come. The user will not be able to log in without a certificate from that issuer.
GRANT SELECT (col3,col4),INSERT (col4) ON music.sheets TO 'tempo'@'%' REQUIRE ISSUER 'C=ZA, ST=Western Cape, L=Cape Town, O=Thawte Consulting cc, OU=Certification Services Division, CN=Thawte Server CA/emailAddress=server-certs@thawte.com';
SUBJECT: Specifies the subject contained in the certificate that is valid for that user. The use of a certificate containing any other subject is disallowed.
GRANT SELECT (col3,col4),INSERT (col4) ON music.sheets TO 'tempo'@'%' REQUIRE SUBJECT 'C=US, ST=California, L=Pasadena, O=Indiana Grones, OU=Raiders, CN=www.lostarks.com/ [email protected]';

Using a WITH clause
MySQL's WITH clause is helpful in limiting the resources assigned to a user. WITH takes the following options:

GRANT OPTION: Allows the user to grant any privilege that they themselves hold to other users
MAX_QUERIES_PER_HOUR: Caps the number of queries that the account is allowed to request in one hour
MAX_UPDATES_PER_HOUR: Limits how frequently the user is allowed to issue UPDATE statements to the database
MAX_CONNECTIONS_PER_HOUR: Limits the number of logins that a user is allowed to make in one hour
MAX_USER_CONNECTIONS: Caps the number of simultaneous connections that the user can make at one time

It is important to note that the GRANT OPTION argument to WITH has a timeless aspect. It does not apply statically to the privileges that the user holds at the time of issuance; if left in effect, it applies to whatever privileges the user holds at any point in time. So, if a user is granted the GRANT OPTION for what is meant to be a temporary period, the option is never removed, and that user later grows in responsibilities and privileges, that user can grant those newer privileges to any other user as well. Therefore, one must remove the GRANT OPTION when it is no longer appropriate.

Note also that if a user with access to a particular MySQL database has the ALTER privilege and is then granted the GRANT OPTION privilege, that user can then grant ALTER privileges to a user who has access to the mysql database, thus circumventing the administrative privileges otherwise needed.

The WITH clause follows all other options given in a GRANT statement. So, to grant user tempo the GRANT OPTION, we would use the following statement:

GRANT SELECT (col3,col4),INSERT (col4) ON music.sheets TO 'tempo'@'%' WITH GRANT OPTION;

If we also want to limit the number of queries that the user can issue in one hour to five, we simply add to the arguments of the single WITH clause. We do not need to use WITH a second time.

GRANT SELECT,INSERT ON music.sheets TO 'tempo'@'%' WITH GRANT OPTION MAX_QUERIES_PER_HOUR 5;

More information on the many uses of WITH can be found at http://dev.mysql.com/doc/refman/5.1/en/grant.html

Granting access in Python
Using MySQLdb to enable user privileges is no more difficult than doing so in MySQL itself. As with creating and dropping users, we simply need to form the statement and pass it to MySQL through the appropriate cursor. As with the native interface to MySQL, we only have as much authority in Python as our login allows. Therefore, if the credentials with which a cursor is created have not been given the GRANT option, an error will be thrown by MySQL and, subsequently, by MySQLdb.
Assuming that user skipper has the GRANT option as well as the other necessary privileges, we can use the following code to create a new user, set that user's password, and grant that user privileges:

#!/usr/bin/env python

import MySQLdb

host = 'localhost'
user = 'skipper'
passwd = 'secret'

mydb = MySQLdb.connect(host, user, passwd)
cursor = mydb.cursor()

try:
    mkuser = 'symphony'
    creation = "CREATE USER %s@'%s'" % (mkuser, host)
    results = cursor.execute(creation)
    print "User creation returned", results

    mkpass = 'n0n3wp4ss'
    setpass = "SET PASSWORD FOR '%s'@'%s' = PASSWORD('%s')" % (mkuser, host, mkpass)
    results = cursor.execute(setpass)
    print "Setting of password returned", results

    granting = "GRANT ALL ON *.* TO '%s'@'%s'" % (mkuser, host)
    results = cursor.execute(granting)
    print "Granting of privileges returned", results

except MySQLdb.Error, e:
    print e

If there is an error anywhere along the way, it is printed to screen. Otherwise, the several print statements are executed. As long as they all return 0, each step was successful.
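To confirm that the new privileges are in place, the same connection can be used to issue the SHOW GRANTS statement mentioned earlier. The following is a minimal sketch, not from the original article; it reuses the connection details from the example above, and the exact strings returned will vary with your server:

#!/usr/bin/env python

import MySQLdb

host = 'localhost'
user = 'skipper'
passwd = 'secret'

mydb = MySQLdb.connect(host, user, passwd)
cursor = mydb.cursor()

try:
    # Ask MySQL to report every privilege held by the account we just configured.
    cursor.execute("SHOW GRANTS FOR 'symphony'@'localhost'")
    for row in cursor.fetchall():
        print row[0]
except MySQLdb.Error, e:
    print e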

Getting Up and Running with MySQL for Python

Packt
24 Dec 2010
8 min read
MySQL for Python
Integrate the flexibility of Python and the power of MySQL to boost the productivity of your Python applications
Implement the outstanding features of Python's MySQL library to their full potential
See how to make MySQL take the processing burden from your programs
Learn how to employ Python with MySQL to power your websites and desktop applications
Apply your knowledge of MySQL and Python to real-world problems instead of hypothetical scenarios
A manual packed with step-by-step exercises to integrate your Python applications with the MySQL database server

Getting MySQL for Python
How you get MySQL for Python depends on your operating system and the level of authorization you have on it. In the following subsections, we walk through the common operating systems and see how to get MySQL for Python on each.

Using a package manager (only on Linux)
Package managers are used regularly on Linux, but none come by default with Macintosh and Windows installations, so users of those systems can skip this section.

A package manager takes care of downloading, unpacking, installing, and configuring new software for you. In order to use one to install software on your Linux installation, you will need administrative privileges. Administrative privileges on a Linux system can be obtained legitimately in one of the following three ways:
Log into the system as the root user (not recommended)
Switch user to the root user using su
Use sudo to execute a single command as the root user

The first two require knowledge of the root user's password. Logging into a system directly as the root user is not recommended, because there is no indication in the system logs as to who used the root account. Logging in as a normal user and then switching to root using su is better, because it keeps an account of who did what on the machine and when. Either way, if you access the root account, you must be very careful, because small mistakes can have major consequences. Unlike other operating systems, Linux assumes that you know what you are doing if you access the root account and will not stop you from going so far as deleting every file on the hard drive.

Unless you are familiar with Linux system administration, it is far better, safer, and more secure to prefix the package manager call with the sudo command. This gives you the benefit of restricting the use of administrator-level authority to a single command, so the chances of catastrophic mistakes are mitigated to a great degree. More information on any of these commands is available by prefacing either man or info before any of the preceding commands (su, sudo).

Which package manager you use depends on which of the two mainstream package management systems your distribution uses. Users of RedHat, Fedora, SUSE, or Mandriva will use the RPM Package Manager (RPM) system. Users of Debian, Ubuntu, and other Debian derivatives will use the apt suite of tools available for Debian installations. Each is discussed in the following sections.

Using RPMs and yum
If you use SUSE, RedHat, or Fedora, the operating system comes with the yum package manager. You can see if MySQLdb is known to the system by running a search (here using sudo):

sudo yum search mysqldb

If yum returns a hit, you can then install MySQL for Python with the following command:

sudo yum install mysqldb

Using RPMs and urpm
If you use Mandriva, you will need to use the urpm package manager in a similar fashion.
To search, use urpmq:

sudo urpmq mysqldb

And to install, use urpmi:

sudo urpmi mysqldb

Using apt tools on Debian-like systems
Whether you run a version of Ubuntu, Xandros, or Debian, you will have access to aptitude, the default Debian package manager. Using sudo, we can search for MySQLdb in the apt sources using the following command:

sudo aptitude search mysqldb

On most Debian-based distributions, MySQL for Python is listed as python-mysqldb. Once you have found how apt references MySQL for Python, you can install it using the following command:

sudo aptitude install python-mysqldb

Using a package manager automates the entire process, so you can move on to the section Importing MySQL for Python.

Using an installer for Windows
Windows users will need to use the older 1.2.2 version of MySQL for Python. Using a web browser, go to the following link:

http://sourceforge.net/projects/mysql-python/files/

This page offers a listing of all available files for all platforms. At the end of the file listing, find mysql-python and click on it. The listing will unfold to show folders containing versions of MySQL for Python back to 0.9.1. The version we want is 1.2.2.

Windows binaries do not currently exist for the 1.2.3 version of MySQL for Python. To get them, you would need to install a C compiler on your Windows installation and compile the binary from source.

Click on 1.2.2 and unfold the file listing. As you will see, the Windows binaries are differentiated by Python version; both 2.4 and 2.5 are supported. Choose the one that matches your Python installation and download it. Note that all available binaries are for 32-bit Windows installations, not 64-bit.

After downloading the binary, installation is a simple matter of double-clicking the installation EXE file and following the dialogue. Once the installation is complete, the module is ready for use, so go to the section Importing MySQL for Python.

Using an egg file
One of the easiest ways to obtain MySQL for Python is as an egg file, and it is best to use one of those files if you can. Several advantages can be gained from working with egg files, such as:
They can include metadata about the package, including its dependencies
They allow for the use of egg-aware software, a helpful level of abstraction
Eggs can, technically, be placed on the Python executable path and used without unpacking
They save the user from installing packages for which they do not have the appropriate version of software
They are so portable that they can be used to extend the functionality of third-party applications

Installing egg handling software
One of the best-known egg utilities, Easy Install, is available from the PEAK Developers' Center at http://peak.telecommunity.com/DevCenter/EasyInstall. How you install it depends on your operating system and whether you have package management software available. In the following sections, we look at several ways to install Easy Install on the most common systems.

Using a package manager (Linux)
On Ubuntu you can try the following to install the easy_install tool (if it is not available already):

shell> sudo aptitude install python-setuptools

On RedHat or CentOS you can try using the yum package manager:

shell> sudo yum install python-setuptools

On Mandriva use urpmi:

shell> sudo urpmi python-setuptools

You must have administrator privileges to do the installations just mentioned.
Without a package manager (Mac, Linux)
If you do not have access to a Linux package manager, but nonetheless have a Unix variant as your operating system (for example, Mac OS X), you can install Python's setuptools manually. Go to:

http://pypi.python.org/pypi/setuptools#files

Download the relevant egg file for your Python version. When the file is downloaded, open a terminal and change to the download directory. From there, you can run the egg file as a shell script. For Python 2.5, the command would look like this:

sh setuptools-0.6c11-py2.5.egg

This will install several files, but the most important one for our purposes is easy_install, usually located in /usr/bin.

On Microsoft Windows
On Windows, one can download the setuptools suite from the following URL:

http://pypi.python.org/pypi/setuptools#files

From the list located there, select the most appropriate Windows executable file. Once the download is completed, double-click the installation file and proceed through the dialogue. The installation process will set up several programs, but the one important for our purposes is easy_install.exe. Where this is located will differ by installation and may require using the search function from the Start Menu. On 64-bit Windows, for example, it may be in the Program Files (x86) directory. If in doubt, do a search. On Windows XP with Python 2.5, it is located here:

C:\Python25\Scripts\easy_install.exe

Note that you may need administrator privileges to perform this installation. Otherwise, you will need to install the software for your own use. Depending on the setup of your system, this may not always work. Installing software on Windows for your own use requires the following steps:
Copy the setuptools installation file to your Desktop.
Right-click on it and choose the runas option.
Enter the name of the user who has enough rights to install it (presumably yourself).

After the software has been installed, ensure that you know the location of the easy_install.exe file. You will need it to install MySQL for Python.
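Whichever installation route you take, a quick check from the Python interpreter confirms that the module can be imported. This is a minimal sketch, not from the original article; the version string shown is only an example and will differ depending on which release you installed:

$ python
>>> import MySQLdb
>>> MySQLdb.__version__
'1.2.2'
>>>

If the import succeeds without an ImportError, MySQL for Python is ready to use.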