
How-To Tutorials - Data


ADO.NET Entity Framework

Packt
23 Oct 2009
6 min read
Creating an Entity Data Model

You can create the ADO.NET Entity Data Model in one of two ways:

- Use the ADO.NET Entity Data Model Designer
- Use the command line Entity Data Model generator called EdmGen.exe

We will first take a look at how we can design an Entity Data Model using the ADO.NET Entity Data Model Designer, a Visual Studio wizard that is enabled after you install the ADO.NET Entity Framework and its tools. It provides a graphical interface that you can use to generate an Entity Data Model.

Creating the Payroll Entity Data Model using the ADO.NET Entity Data Model Designer

Here are the tables of the 'Payroll' database that we will use to generate the data model: Employee, Designation, Department, Salary, and ProvidentFund. To create an entity data model using the ADO.NET Entity Data Model Designer, follow these simple steps:

1. Open Visual Studio.NET, create a solution for a new web application project, and save it with a name.
2. Switch to the Solution Explorer, right-click, and click on Add New Item.
3. Select ADO.NET Entity Data Model from the list of templates displayed.
4. Name the Entity Data Model PayrollModel and click on Add.
5. Select Generate from database in the Entity Data Model Wizard. Note that you can also use the Empty model template to create the Entity Data Model yourself, building the Entity Types and their relationships manually by dragging items from the toolbox; we will not use that template in our discussion here.
6. Click on Next in the Entity Data Model Wizard. A modal dialog box appears and prompts you to choose your connection.
7. Click on New Connection and specify the connection properties and parameters. We will use a dot to specify the database server name, which means we will be using the database server of the localhost, the current system in use.
8. After you specify the necessary user name, password, and server name, you can test your connection using the Test Connection button. When you do so, the message Test connection succeeded is displayed in a message box. Click on OK to close it.
9. Note the Entity Connection String that is generated automatically. This connection string will be saved in the connectionStrings section of your application's web.config file, and it will look like this:

<connectionStrings>
  <add name="PayrollEntities"
       connectionString="metadata=res://*;provider=System.Data.SqlClient;provider connection string=&quot;Data Source=.;Initial Catalog=Payroll;User ID=sa;Password=joydip1@3;MultipleActiveResultSets=True&quot;"
       providerName="System.Data.EntityClient" />
</connectionStrings>

10. Click on Next, then expand the Tables node and specify the database objects that you require in the Entity Data Model to be generated.
11. Click on Finish to generate the Entity Data Model.
The output displayed in the Output Window while the Entity Data Model is being generated confirms that the model has been created and saved in a file named PayrollModel.edmx. We are done creating our first Entity Data Model using the ADO.NET Entity Data Model Designer tool. When you open the Payroll Entity Data Model in the designer view, note how the Entity Types in the model are related to one another; these relationships have been generated automatically by the Entity Data Model Designer based on the relationships between the tables of the Payroll database. In the next section, we will learn how we can create an Entity Data Model using the EdmGen.exe command line tool.

Creating the Payroll Data Model Using the EdmGen Tool

We will now take a look at how to create a data model using the Entity Data Model generation tool called EdmGen. The EdmGen.exe command line tool can be used to do one or more of the following:

- Generate the .csdl, .msl, and .ssdl files as part of the Entity Data Model
- Generate object classes from a .csdl file
- Validate an Entity Data Model

The EdmGen.exe command line tool generates the Entity Data Model as a set of three files: .csdl, .msl, and .ssdl. If you have used the ADO.NET Entity Data Model Designer to generate your Entity Data Model, the generated .edmx file will contain the CSDL, MSL, and SSDL sections; you will have a single .edmx file that bundles all of these sections into it. On the other hand, if you use the EdmGen.exe tool to generate the Entity Data Model, you will find three distinct files with .csdl, .msl, and .ssdl extensions. Here is a list of the major options of the EdmGen.exe command line tool:

  /help                        Displays help on all the possible options of this tool (short form: /?).
  /language:CSharp             Generates code using the C# language.
  /language:VB                 Generates code using the VB language.
  /provider:<string>           Specifies the name of the ADO.NET data provider to use.
  /connectionstring:<string>   Specifies the connection string used to connect to the database.
  /namespace:<string>          Specifies the name of the namespace.
  /mode:FullGeneration         Generates CSDL, MSL, and SSDL objects from the database schema.
  /mode:EntityClassGeneration  Generates entity classes from a given CSDL file.
  /mode:FromSsdlGeneration     Generates MSL, CSDL, and entity classes from a given SSDL file.
  /mode:ValidateArtifacts      Validates the CSDL, SSDL, and MSL files.
  /mode:ViewGeneration         Generates mapping views from the CSDL, SSDL, and MSL files.
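For illustration, a full-generation run against the Payroll database used earlier might look something like the following. Treat this as a sketch rather than a definitive invocation: the /project switch, which sets the base name of the generated files, is not listed in the table above, and the exact switches accepted can vary between framework versions.

EdmGen.exe /mode:FullGeneration ^
    /project:PayrollModel ^
    /provider:System.Data.SqlClient ^
    /connectionstring:"Data Source=.;Initial Catalog=Payroll;User ID=sa;Password=joydip1@3" ^
    /language:CSharp

A run like this would produce PayrollModel.csdl, PayrollModel.msl, and PayrollModel.ssdl along with the generated object-layer code, and the three files could then be checked with /mode:ValidateArtifacts.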

Comparing Cursor and Set Approaches in Processing Relational Data

Packt
23 Oct 2009
5 min read
To give you an idea of what a cursor is, the following DECLARE statement creates a cursor (named zero_bal_crs), which gives you access to the rows of a payment table that have balances of 100 or less.

DECLARE zero_bal_crs CURSOR FOR
  SELECT * FROM payment WHERE payment_bal < 100

You then fetch and process each of the rows sequentially; the process can be, for example, summing up rows by product group and loading the sums into a summary table. You develop the process outside of the cursor, for example in a stored procedure. (You'll see in the examples that the cursor-based process is procedural.)

Instead of processing row by row sequentially, you can process relational data set by set, without a cursor. For example, to sum all payment rows with balances of 100 or less and load the result into a table (named payment_100orless_bal in the following SQL statement), you can use a single SQL statement that completely processes the qualifying rows all at once, as a set:

INSERT INTO payment_100orless_bal
SELECT SUM(payment_bal)
FROM payment
WHERE payment_bal < 100

The following three examples, which are among those I most frequently encountered, further compare cursor-based processing with set processing.

Example 1: Sequential Loop cf. One SQL Statement

Our process in the first example is to summarize sales transactions by product. Listing 1.1 shows the DDLs for creating the sales transaction table and the sales product table that stores the summary.

Listing 1.1: DDLs of the Example 1 Tables

CREATE TABLE sales_transactions(
  product_code INT,
  sales_amount DEC(8,2),
  discount INT);

CREATE TABLE sales_product(
  product_code INT,
  sales_amount DEC(8,2));

Listing 1.2 shows a cursor-based stored procedure that implements the process. In this example, all the cursor (sales_crs) does is put the rows in ascending order by their product codes. As you might have expected, this stored procedure applies a loop with an if-then-else construct to process the data row by row.

Listing 1.2: Cursor (Procedural) Solution

DELIMITER $$
DROP PROCEDURE IF EXISTS ex1_cursor $$
USE sales $$
CREATE PROCEDURE ex1_cursor()
BEGIN
  DECLARE p INT DEFAULT 0;
  DECLARE s DECIMAL(8,2) DEFAULT 0;
  DECLARE d INT DEFAULT 0;
  DECLARE done INT DEFAULT 0;
  DECLARE first_row INT DEFAULT 0;
  DECLARE px INT DEFAULT 0;
  DECLARE sx DECIMAL(8,2) DEFAULT 0;
  DECLARE sales_crs CURSOR FOR
    SELECT * FROM sales_transactions ORDER BY product_code;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
  OPEN sales_crs;
  REPEAT
    FETCH sales_crs INTO p, s, d;
    IF first_row = 0 THEN
      SET first_row = 1;
      SET px = p;
      SET sx = s * (100 - d)/100;
    ELSEIF NOT done THEN
      IF p <> px THEN
        INSERT INTO sales_product VALUES(px, sx);
        SET sx = s * (100 - d)/100;
        SET px = p;
      ELSE
        SET sx = sx + s * (100 - d)/100;
      END IF;
    ELSE
      INSERT INTO sales_product VALUES(px, sx);
    END IF;
  UNTIL done END REPEAT;
  CLOSE sales_crs;
END $$
DELIMITER ;

The stored procedure in Listing 1.3 implements set processing to accomplish the same purpose as its foregoing cursor-based counterpart. You can see that this stored procedure is simpler than its cursor-based equivalent.

Listing 1.3: Set Operation Solution

DELIMITER //
USE sales //
DROP PROCEDURE IF EXISTS ex1_sql //
CREATE PROCEDURE ex1_sql ()
BEGIN
  INSERT INTO sales_product
  SELECT product_code,
         SUM(sales_amount * (100 - discount)/100)
  FROM sales_transactions
  GROUP BY product_code;
END //
DELIMITER ;

Example 2: Nested Cursor cf. Join

Our second example consolidates orders by customer. While example 1 has one table, example 2 has two: a sales order table and an item table.
The DDLs of the two tables are shown in Listing 2.1. To process two tables, our cursor implementation uses a nested loop (Listing 2.2), which makes it even more complex than example 1.

Listing 2.1: DDLs of the Example 2 Tables

CREATE TABLE sales_order (
  order_number INT,
  order_date DATE,
  customer_number INT);

CREATE TABLE item (
  order_number INT,
  product_code INT,
  quantity INT,
  unit_price DEC(8,2));

Listing 2.2: Nested Cursor

DELIMITER $$
DROP PROCEDURE IF EXISTS ex2_cursor $$
CREATE PROCEDURE ex2_cursor()
BEGIN
  DECLARE so INT;
  DECLARE sc INT;
  DECLARE io INT;
  DECLARE iq INT;
  DECLARE iu DEC(10,2);
  DECLARE done1 VARCHAR(5) DEFAULT 'START';
  DECLARE done2 VARCHAR(5) DEFAULT 'START';
  DECLARE sales_crs CURSOR FOR
    SELECT customer_number, order_number
    FROM sales_order
    ORDER BY customer_number, order_number;
  DECLARE order_crs CURSOR FOR
    SELECT order_number, quantity, unit_price
    FROM item
    WHERE order_number = so
    ORDER BY order_number;
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done1 = 'END';
  OPEN sales_crs;
  WHILE done1 <> 'END' DO
    FETCH sales_crs INTO sc, so;
    IF done1 <> 'END' THEN
      /* inner cursor */
      OPEN order_crs;
      SET done2 = done1;
      WHILE done1 <> 'END' DO
        FETCH order_crs INTO io, iq, iu;
        IF done1 <> 'END' THEN
          INSERT INTO customer_order VALUES (sc, iq * iu);
        END IF;
      END WHILE;
      CLOSE order_crs;
      SET done1 = done2;
    END IF;
  END WHILE;
  CLOSE sales_crs;
END $$
DELIMITER ;

Listing 2.3 is the set solution: by applying a join, the whole consolidation becomes a single SQL statement, and this set-based stored procedure is again simpler than its cursor-based equivalent.
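Listing 2.3 itself did not survive in this excerpt. A minimal sketch of the join-based solution, reusing the table and column names from Listings 2.1 and 2.2 (customer_order is assumed to hold a customer number and a consolidated amount), could look like this; drop the SUM and GROUP BY if you want one output row per item line, which is what the cursor version above actually produces:

DELIMITER //
DROP PROCEDURE IF EXISTS ex2_sql //
CREATE PROCEDURE ex2_sql()
BEGIN
  INSERT INTO customer_order
  SELECT o.customer_number,
         SUM(i.quantity * i.unit_price)
  FROM sales_order o
  JOIN item i ON i.order_number = o.order_number
  GROUP BY o.customer_number;
END //
DELIMITER ;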

Creating a Simple Report using BIRT

Packt
23 Oct 2009
5 min read
Setting up a Simple Project

The first thing we want to do when setting up our simple report project is to define what the project is going to be, and what our first simple report will be. Our first report will be a simple dump of the employees who work for Classic Cars. So, the first thing we need to do is set up a project. To do this, we will use the Navigator. Make sure you have the BIRT report perspective open, then use the following steps to create the project:

1. Open the Navigator by single-clicking on the Navigator tab.
2. Right-click anywhere in the white space in the Navigator.
3. Select New from the menu, and under New select Project.
4. From the dialog screen, select Business Intelligence and Reporting Tools from the list of folders, expand that view, and select Report Project. Then click on the Next button.
5. For the Project name, enter Class_Cars_BIRT_Reports. You can either leave the Use Default Location checkbox checked, or uncheck it and enter a location on your local drive to store this report project.

Now we have a very simple report project in which to store our BIRT reports, starting with the first one that we are about to create.

Creating a Simple Report

Now that we have our first project open, we will look at creating our first report. As mentioned earlier, we will create a basic listing report that will display all the information in the employees table. In order to do this, we will use the following steps:

1. Right-click on the Class_Cars_BIRT_Reports project under the Navigator, and choose New and Report.
2. Make sure the Class_Cars_BIRT_Reports project is highlighted in the new report dialog, and enter the name EmployeeList.rptdesign. I chose this name as it is somewhat descriptive of the purpose of the report, which is to display a list of employees. As a rule of thumb, always try to name your reports after the expected output, such as QuarterlyEarningReport.rptdesign, weeklyPayStub.rptdesign, or accountsPayable.rptdesign.
3. On the next screen is a list of different report templates that we can use. Select Simple Listing and then click on the Finish button.
4. Go to the Data Explorer, right-click on Data Sources, and choose New Data Source.
5. From the New Data Source dialog box, select Classic Models Inc. Sample Database and click on the Next button. The next screen informs you of the driver information; you can ignore this for now and click Finish.
6. Under the Data Explorer, right-click on Data Sets and choose New Data Set.
7. On the next screen, enter the Data Set Name as dsetEmployees, and make sure that our created Data Source is selected in the list of Data Sources. Click Next when this is finished.
8. On the Query dialog, enter a query that selects the employee columns (a sample query is sketched at the end of this article) and click Finish.
9. On the next screen, just click OK. This screen is used to edit information about Data Sets, and we will ignore it for now.
10. Now, from the Outline, select Data Sets and expand it to show all of the fields.
11. Drag the EMPLOYEENUMBER element over to the Report Designer, and drop it on the cell with the label Detail Row. This will be the second row and the first column. You will notice that when you do this, the header row also gets an element placed in it called EMPLOYEENUMBER. This is the header label. Double-click on this cell and it will become highlighted so that we can edit it. Type in "Employee ID".
12. Drag and drop the LASTNAME, FIRSTNAME, and JOBTITLE fields to the detail cells to the right of the EMPLOYEENUMBER cell.
13. Now we want to put the header row in bold. Under the Outline, select the Row element located under Body/Table/Header. This will change the Property Editor. Click on Font, and then click on the Bold button.

That's it! We have created our first basic report. To see what this report looks like, click on the Preview tab under the Report Designer pane. This will give you a good idea of what the report will look like. Alternatively, you can actually run the report and see how it looks in the BIRT Report Viewer application by going to File/View Report/View Report in Web Viewer. This option is also available by right-clicking on the report design file under the Navigator and choosing Report, followed by Run.

Although it may be a simple report, this exercise demonstrated how a report developer can get around the BIRT environment, and how the different elements of the BIRT perspective work together.

Summary

For a very simple report design, we utilized all of the major areas of the BIRT perspective. We used the Navigator to create a new report project and a new report design, the Data Explorer to create our data connection and Data Set, dragged elements from the Outline to the Report Designer to get the data elements into the right place, and used the Property Editor and Outline cooperatively to bold the text in the table header.
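The query referenced in step 8 above was lost in this excerpt. Against the Classic Models sample database, a query along these lines would drive the listing; the employees table and column names are assumed from the fields used later in the walkthrough, so adjust the table name or schema qualifier to match your copy of the sample database:

SELECT EMPLOYEENUMBER,
       LASTNAME,
       FIRSTNAME,
       JOBTITLE
FROM   EMPLOYEES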

PostgreSQL's Transaction Model

Packt
23 Oct 2009
7 min read
On Databases

Databases come in many forms. The simplest definition of a database is any system of storing, organizing, and retrieving data. With this definition, things like memory, hard drives, file systems, files on those file systems (stored in plain text, tab-delimited, XML, JSON, or even BDB formats), and even applications like MySQL, PostgreSQL, and Oracle are considered databases. Databases allow users to:

- Store data
- Organize data
- Retrieve data

It is important to keep a broad perspective on what data and databases really are so that you can always choose the best solution for your particular problem. The SQL databases (MySQL, PostgreSQL, Oracle, and others) are remarkable because of the flexibility and performance they provide. In my work, I look to them first when developing an application, with an eye towards getting the data model right before optimization. Once the application is solid, and once I fully understand which parts of the data system are too slow and which are fast enough, then I can start building my own database on top of the file system or other existing technologies that will give me the kind of performance I need.

- PostgreSQL: free, BSD-licensed popular database. http://postgresql.org/
- MySQL: free, GPL-licensed popular database. http://mysql.org
- Oracle: commercial industrial database. http://oracle.com
- SQL Server: Microsoft's commercial database. http://www.microsoft.com/SQL/default.mspx

Among the SQL databases, which one is best? There are many criteria I use to evaluate SQL databases, and the one I pay attention to most is how they comply (if at all) with the ACID model. Given the technical merits of the various SQL databases, I consistently choose PostgreSQL above all other SQL databases when given a choice. Allow me to explain why.

The ACID Model

ACID is an acronym standing for the four words Atomicity, Consistency, Isolation, and Durability. These are fancy words for some very basic and essential concepts.

Atomicity means that you either make all of the changes you want, or none of them, without leaving the database in some weird in-between state. When you take into account catastrophes like power failures or corruption, atomicity isn't as simple as it first seems. Consistency means that any state of the database will be internally consistent with the rules that constrain the data. That is, if you have a table with a primary key, then that table will not contain any violations of the primary key constraints after any transaction. Isolation means that you can be modifying many different parts of the database at the same time without affecting each other. (As a stricter guarantee, there is serializability, which requires that transactions occur one after the other, or at least that their results appear as if they did.) Durability means that once a transaction completes, it is never lost, ever.

- Atomicity: all or nothing
- Consistency: rules kept
- Isolation: no partials seen
- Durability: doesn't disappear

ACID compliance isn't rocket science, but it isn't trivial either. These requirements form a minimum standard absolutely necessary to provide a database for a reasonable application. That is, if you can't guarantee these things, then the users of your application are going to be frustrated, since they naturally assume that the ACID model is followed. And if the users of the application get frustrated, then the developers of the application will get frustrated as they try to comply with the users' expectations.
A lot of frustration can be avoided if the database simply complies with the principles of the ACID model. If the database gets it right, then the rest of the application will have no problem getting it right as well. Our users will be happy, since their expectations of ACID compliance will be met. Remember: users expect ACID!

What Violating the ACID Model Looks Like

To consider the importance of the ACID model, let's examine, briefly, what happens when the model is violated.

When Atomicity isn't adhered to, users will see their data partially committed. For instance, they might find their online profile only partially modified, or their bank transfer partially transferred. This is, of course, devastating to the unwary user.

When Consistency is violated, the rules that the data should follow aren't adhered to. Perhaps the number of friends shown doesn't match the friends they actually have in a social networking application. Or perhaps they see that their bank balance doesn't match what the numbers add up to. Or worse, perhaps your order system is counting orders that don't even exist and not counting orders that do.

When Isolation isn't guaranteed, users will either have to use a system where only one person can change something at a time, locking out all others, or they will see inconsistencies throughout the world of data, inconsistencies resulting from transactions that are in progress elsewhere. This makes the data unreliable, just like violating Atomicity or Consistency. A bank user, for instance, will believe their transfer of funds was successful when in reality their money was simultaneously being withdrawn by another transaction.

When Durability is lost, users will never know whether their transaction really went through, or whether it will mysteriously disappear down the road, with all the trouble that entails.

I am sure we have all had experiences dealing with data systems that didn't follow the ACID model. I remember the days when you had to save your files frequently, and even then you still weren't assured that all of your data would be properly saved. I also recall applications that would make partial or incomplete changes and expose these inconsistent states to the user. In today's world, writing applications with faults like the above is simply inexcusable. There are too many readily available tools that make writing ACID compliant systems easy. One of those tools, probably the most popular of all, is the SQL database.

Satisfying ACID with Transactions

The principal way that databases comply with ACID requirements is through the concept of transactions. Ideally, each transaction would occur in an instant, updating the database according to the state of the database at that moment. In reality, this isn't possible: it takes time to accumulate the data and apply the changes.

Typical transaction SQL commands:

- BEGIN: start a new transaction
- COMMIT: commit the transaction
- ROLLBACK: roll back the transaction in progress

Since multiple sessions can each be creating and applying a transaction simultaneously, special precautions have to be taken to ensure that the data that each transaction "sees" is consistent, and that the effects of each transaction appear all together or not at all. Special care is also taken to ensure that when a transaction is committed, the database will be put in a state where catastrophic events will not leave the transaction partially committed. Contrary to popular belief, there are a variety of ways that databases support transactions.
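To make this concrete, the bank-transfer scenario mentioned above maps naturally onto a single transaction. This is only a sketch; the accounts table and its columns are invented for the example:

BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;
COMMIT;  -- or ROLLBACK; to abandon both changes together

Either both updates become visible to other sessions at COMMIT, or neither does.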
It is well worth the time to read and understand PostgreSQL's two levels of transaction isolation and the four possible isolation levels described in Section 12.2 of the PostgreSQL documentation. Note that some of the weaker levels of transaction isolation sacrifice some extreme cases of ACID compliance for the sake of performance. These edge cases can be properly handled with appropriate use of row-locking techniques; row-locking is beyond the scope of this article.

Keep in mind that the levels of transaction isolation are only what appears to users of the database. Inside the database, there is a remarkable variety of methods for actually implementing transactions. Consider that while you are in a transaction, making changes to the database, every other transaction has to see one version of the database while you see another. In effect, you have to have copies of some of the data lying around somewhere. Queries to that data have to know which version of the data to retrieve: the copy, the original, or the modified version (and which modified version?). Changes to the data have to go somewhere: the original, a copy, or some modified version (again, which?). Answering these questions leads to the various implementations of transactions in ACID compliant databases. For the purposes of this article, I will examine only two: Oracle's and PostgreSQL's implementations. If you are only familiar with Oracle, then hopefully you will learn something new and fascinating as you investigate PostgreSQL's method.
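As a practical aside before moving on, the isolation level discussed above is chosen per transaction in PostgreSQL. A minimal sketch, reusing the hypothetical accounts table from the earlier example:

BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- every query in this transaction now works from one consistent snapshot
SELECT SUM(balance) FROM accounts;
COMMIT;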

The ASP.NET Repeater Control

Packt
23 Oct 2009
6 min read
The Repeater control in ASP.NET is a data-bound container control that can be used to automate the display of a collection of repeated list items. These items can be bound to either of the following data sources:

- Database table
- XML file

In a Repeater control, the data is rendered as DataItems that are defined using one or more templates. You can even use HTML tags such as <li>, <ul>, or <div> if required. Similar to the DataGrid, DataList, or GridView controls, the Repeater control has a DataSource property that is used to set the data source of this control to any ICollection, IEnumerable, or IListSource instance. Once this is set, the data from one of these types of data sources can be easily bound to the Repeater control using its DataBind() method.

However, the Repeater control by itself does not support paging or editing of data. The Repeater control is lightweight and does not offer as many features as the DataGrid, but it enables you to place HTML code in its templates. It is great in situations where you need to display the data quickly and format it easily.

Using the Repeater Control

The Repeater control is a data-bound control that uses templates to display data. It does not have any built-in support for paging, editing, or sorting of the data that is rendered through one or more of its templates. The Repeater control works by looping through the records in your data source and then repeating the rendering of one of its templates, called the ItemTemplate, which contains the records that the control needs to render.

To use this control, drag and drop it from the toolbox onto the web form, either in the design view or directly in the source view. For customizing the behavior of this control, you have to use the built-in templates that it comes with. These templates are actually blocks of HTML code. The Repeater control contains the following five templates:

- HeaderTemplate
- ItemTemplate
- AlternatingItemTemplate
- SeparatorTemplate
- FooterTemplate

When populated with data, a Repeater control can use all of these templates (Header, Item, Footer, Alternating, and Separator) together. The following code snippet is an example of the order in which the templates of the Repeater control are used:

<asp:Repeater id="repEmployee" runat="server">
  <HeaderTemplate>
    ...
  </HeaderTemplate>
  <ItemTemplate>
    ...
  </ItemTemplate>
  <FooterTemplate>
    ...
  </FooterTemplate>
</asp:Repeater>

When the Repeater control is bound to a data source, the data from the data source is displayed using the ItemTemplate element and any other optional elements, if used. Note that the contents of the HeaderTemplate and the FooterTemplate are rendered once for each Repeater control, while the contents of the ItemTemplate are rendered for each record in the control. You can also use the additional AlternatingItemTemplate element after the ItemTemplate element to specify the appearance of each alternate record, and the SeparatorTemplate element between each record to specify the separators for the records.

Displaying Data Using the Repeater Control

This section discusses how we can display data using the Repeater control. As discussed earlier, the Repeater control uses templates for formatting the data that it displays.
In the accompanying example, the .aspx file contains a Repeater control that makes use of these templates, and the data is bound to the control from the code-behind file using the DataManager class; the Repeater control is populated with data in the Page_Load event by reusing the DataManager class. Note how the SeparatorTemplate and the AlternatingItemTemplate are used, and how the DataBinder.Eval() method displays the values of the corresponding fields from the data container (in our case, the DataSet instance) in the Repeater control. The FooterTemplate uses the Total Records variable and substitutes its value to display the total number of records displayed by the control.

The Header and Footer templates of the Repeater control are rendered even if the data source does not contain any data. If you want to suppress their display, you can do so with some simple logic in the Visible property of the Repeater control. Here is how you specify the Visible property of this control in your .aspx file to achieve this:

Visible='<%# Repeater1.Items.Count > 0 %>'

When you specify the Visible property as shown here, the Repeater is made visible only if there are records in your data source.

Displaying Checkboxes in a Repeater Control

Let us now understand how we can display checkboxes in a Repeater control and retrieve the number of checked items. We will use a Button control and a Label control in our page. When you click on the Button control, the number of checked items in the Repeater control will be displayed in the Label control.

The data is bound to the Repeater control in the Page_Load event. Note that we use Page.IsPostBack to check whether the page has posted back in the Page_Load method. If you bind data without checking whether the page has posted back, the Repeater control will be rebound to data after each postback and all the checkboxes on your web page will be reset to the unchecked state.

When you execute the application, the Repeater control is displayed with records from the employee table. Check one or more of the checkboxes and then click on the Button control just beneath the Repeater control; the number of checked records is displayed in the Label control.

Summary

This article discussed the Repeater control and how we can use it in our ASP.NET applications. Though this control does not support all the functionality of other data controls, like the DataGrid and GridView, it is still a good choice if you want faster rendering of data, as it is lightweight and very flexible.

Creating a View with MySQL Query Browser

Packt
23 Oct 2009
2 min read
Please refer to an earlier article by the author to learn how to build queries visually.

Creating a View from an Existing Query

To create a view from a query, you must have executed the query successfully. To be more precise, the view is created from the latest successfully executed query, not necessarily from the query currently in the Query Area. To further clarify, the following three examples are cases where the view is not created from the current query:

- Your current query fails, and immediately afterwards you create a view from the query. The view created is not from the failed query. If the failed query is the first query in your MySQL Query Browser session, you can't create any view.
- You have just moved forward or backward through the queries in the Query Area without executing, so your current query is not the latest successfully executed one.
- You open a saved query that you have never executed successfully in your active Resultset.

Additionally, if you're changing your Resultset, the view is created from the latest successfully executed query that uses the currently active Resultset to display its output. To make sure your view is built from the query you want, select the query, confirm it as written in the Query Area, execute the query, and then immediately create its view.

You create a view from an existing query by selecting Query | Create View from Select from the Menu bar. Type in the name you want to give to the view, and then click Create View. MySQL Query Browser creates the view; when it is successfully created, you can see the view in the Schemata.

You can modify a view by editing it: right-click the view and select Edit View. The CREATE VIEW statement opens in its Script tab. When you finish editing, you can execute the modified view; if successful, the existing view is replaced with the modified one. If you want to keep the view you're editing and save your changes as a new view instead, change the view's name before you execute the script and remove the DROP VIEW statement.
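To make the editing step concrete, the script that opens when you select Edit View is an ordinary DROP VIEW / CREATE VIEW pair. A hypothetical example (the view and table names here are purely illustrative, not from the article) might look roughly like this:

DROP VIEW IF EXISTS low_balance_payments;
CREATE VIEW low_balance_payments AS
  SELECT *
  FROM payment
  WHERE payment_bal < 100;

Removing the DROP VIEW line and renaming the view in the CREATE VIEW line is what lets you keep the original view while saving the edited version as a new one.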

Visual MySQL Database Design in MySQL Workbench

Packt
21 Oct 2009
3 min read
MySQL Workbench is a visual database design tool recently released by MySQL AB. The tool is specifically for designing MySQL databases. What you build in MySQL Workbench is called a physical data model: a data model for a specific RDBMS product, so the model in this article will have some MySQL-specific features. We can generate (forward-engineer) the database objects from the physical model, which in addition to tables and their columns can also include other objects such as views.

MySQL Workbench has many functions and features; this article by Djoni Darmawikarta shows some of them by way of an example. We'll build a physical data model for an order system where an order can be a sales order or a purchase order, and then forward-engineer our model into a MySQL database. The physical model of our example is built as an EER (Enhanced Entity Relationship) diagram.

Creating the ORDER Schema

Let's first create a schema where we want to store our order physical model:

1. Click the + button on the Physical Schemata toolbar.
2. Change the new schema's default name to ORDER. Notice that when you're typing in the schema name, its tab name on the Physical Schemata also changes accordingly, which is a nice feature.
3. The ORDER schema and its objects are added to the Catalog.
4. Close the schema window, and confirm the rename of the schema when prompted.

Creating the Order Tables

We'll now create the three tables that model the order: the ORDER table and its two subtype tables, SALES_ORDER and PURCHASE_ORDER, in the ORDER schema. First of all, make sure you select the ORDER schema tab, so that the tables we create will be in this schema. We'll create our tables on an EER diagram, so double-click the Add Diagram button, then:

1. Select (click) the Table icon, then move your mouse onto the EER Diagram canvas and click on the location where you want to place the first table. Repeat for the other two tables. You can move the tables around by dragging and dropping.
2. Next, we'll work on table1 using the Workbench's table editor. Start the table editor by right-clicking table1 and selecting Edit Table.
3. Rename the table by typing ORDER over table1.
4. We'll next add its columns, so select the Columns tab.
5. Replace the idORDER column name with ORDER_NO, and select INT as its data type from the drop-down list. We'd like this ORDER_NO column to be valued incrementally by the MySQL database, so we specify it as an AI (Auto Increment) column. AI is a MySQL-specific feature.
6. You can also specify other physical attributes of the table, such as its Collation, as well as other advanced options, such as its trigger and partitioning (the Trigger and Partitioning tabs).

Notice that on the diagram table1 has changed to ORDER, and it has its first column, ORDER_NO. In the Catalog you can also see the three tables; the black dots on the right of the tables indicate that they've been included in a diagram.
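For orientation, forward-engineering just this part of the model would produce DDL roughly like the following. This is a sketch only: the exact script MySQL Workbench generates includes more options, the remaining columns, and the two subtype tables once they are defined, and it assumes ORDER_NO is made the primary key (AUTO_INCREMENT columns must be indexed in MySQL):

-- Backticks are required because ORDER is a reserved word in MySQL.
CREATE SCHEMA IF NOT EXISTS `ORDER`;

CREATE TABLE `ORDER`.`ORDER` (
  `ORDER_NO` INT NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`ORDER_NO`)
);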

Setting up the most Popular Journal Articles in your Personalized Community in Liferay Portal

Packt
21 Oct 2009
6 min read
Personal community is a dynamic feature of Liferay portal. By default, the personal community is a portal-wide setting that affects all of the users. It would be nice to have more features in the personal community, such as showing the most popular journal articles. This article by Jonas Yuan addresses how to set up the most popular journal articles in your personalized community and how to view the counter for other assets.

In a web site, we will have a lot of journal articles (that is, web content) for a given article type. For example, for the article type Article Content, we will have articles talking about a product family. We may want to know how many times end users read each article, and it would be nice if we could show the most popular articles (for example, the TOP 10 articles) for this given article type.

Consider a journal article, My EDI Product I, shown via the portlet Ext Web Content Display. Ratings and comments on this article are exhibited, the medium-size image, polls, and related content of the article are listed, and a view counter for the article is displayed under the ratings. Moreover, the most popular articles are exhibited with article title and number of views under related content. All these articles belong to the article type article-content; that is, the article in the current portlet Ext Web Content Display shows the most popular articles only for the article type article-content.

Of course, you can customize the portlet Web Content Display directly by changing JSP files. For demo purposes, we will implement the view counter in the portlet Ext Web Content Display and implement the most popular articles via VM services and article templates. In addition, we will analyze the view counter for other assets such as Image Gallery images, Document Library documents, Wiki articles, Blog entries, Message Boards threads, and so on.

Adding a view counter in the Web Content Display portlet

First of all, let's add a view counter in the Ext Web Content Display portlet. As the view counter function for assets (including journal articles) is provided by the TagsAssetModel model of the com.liferay.portlet.tags.model package in the /portal/portal-service/src folder, we can use this feature in the portlet directly. To do so, use the following steps:

1. Create a folder journal_content in the folder /ext/ext-web/docroot/html/portlet/.
2. Copy the JSP file view.jsp from the folder /portal/portal-web/docroot/html/portlet/ to the folder /ext/ext-web/docroot/html/portlet/journal_content and open it.
3. Add the line <%@ page import="com.liferay.portlet.tags.model.TagsAsset" %> after the line <%@ include file="/html/portlet/journal_content/init.jsp" %>, and check the following lines:

JournalArticleDisplay articleDisplay = (JournalArticleDisplay) request.getAttribute(
    WebKeys.JOURNAL_ARTICLE_DISPLAY);

if (articleDisplay != null) {
    TagsAssetLocalServiceUtil.incrementViewCounter(
        JournalArticle.class.getName(),
        articleDisplay.getResourcePrimKey());
}

4. Then add the following lines after the line <c:if test="<%= enableComments %>"> and save the file:

<span class="view-count">
  <%
  TagsAsset asset = TagsAssetLocalServiceUtil.getAsset(
      JournalArticle.class.getName(),
      articleDisplay.getResourcePrimKey());
  %>
  <c:choose>
    <c:when test="<%= asset.getViewCount() == 1 %>">
      <%= asset.getViewCount() %> <liferay-ui:message key="view" />,
    </c:when>
    <c:when test="<%= asset.getViewCount() > 1 %>">
      <%= asset.getViewCount() %> <liferay-ui:message key="views" />,
    </c:when>
  </c:choose>
</span>

The code above shows a way to increase the view counter via the TagsAssetLocalServiceUtil.incrementViewCounter method. This method takes two parameters, className and classPK, as inputs; for the current journal article they are JournalArticle.class.getName() and articleDisplay.getResourcePrimKey(). The code then shows a way to display the view count through the TagsAssetLocalServiceUtil.getAsset method, which takes the same two parameters, className and classPK, as inputs. This approach would be useful for other assets too, as the className parameter could refer to Image Gallery, Document Library, Wiki, Blogs, Message Boards, Bookmarks, and so on.

Setting up the VM service

We can set up a VM service to exhibit the most popular articles by adding a getMostPopularArticles method to the custom velocity tool ExtVelocityToolUtil. To do so, first add the following method to the ExtVelocityToolService interface:

public List<TagsAsset> getMostPopularArticles(String companyId, String groupId, String type, int limit);

And then add an implementation of the getMostPopularArticles method in the ExtVelocityToolServiceImpl class as follows:

public List<TagsAsset> getMostPopularArticles(String companyId, String groupId, String type, int limit) {
    List<TagsAsset> results = Collections.synchronizedList(new ArrayList<TagsAsset>());

    // Sub-query: journal articles of the requested type belonging to the
    // same company and group as the tags asset.
    DynamicQuery dq0 = DynamicQueryFactoryUtil.forClass(JournalArticle.class, "journalarticle")
        .setProjection(ProjectionFactoryUtil.property("resourcePrimKey"))
        .add(PropertyFactoryUtil.forName("journalarticle.companyId").eqProperty("tagsasset.companyId"))
        .add(PropertyFactoryUtil.forName("journalarticle.groupId").eqProperty("tagsasset.groupId"))
        .add(PropertyFactoryUtil.forName("journalarticle.type").eq(type));

    // Outer query: tags assets for those articles, ordered by view count.
    DynamicQuery query = DynamicQueryFactoryUtil.forClass(TagsAsset.class, "tagsasset")
        .add(PropertyFactoryUtil.forName("tagsasset.classPK").in(dq0))
        .addOrder(OrderFactoryUtil.desc("tagsasset.viewCount"));

    try {
        List<Object> assets = TagsAssetLocalServiceUtil.dynamicQuery(query);
        int index = 0;
        for (Object obj : assets) {
            TagsAsset asset = (TagsAsset) obj;
            results.add(asset);
            index++;
            if (index == limit) {
                break;
            }
        }
    }
    catch (Exception e) {
        return results;
    }

    return results;
}

The preceding code shows a way to get the most popular articles by company ID, group ID, article type, and a limit on the number of articles to be returned. The DynamicQuery API allows us to leverage the existing mapping definitions through access to the Hibernate session.
For example, DynamicQuery dq0 selects the journal articles by companyId, groupId, and type; DynamicQuery query then selects the tags assets whose classPK appears in dq0, ordered by viewCount. Finally, add the following method to register the above service method in ExtVelocityToolUtil:

public List<TagsAsset> getMostPopularArticles(String companyId, String groupId, String type, int limit) {
    return _extVelocityToolService.getMostPopularArticles(companyId, groupId, type, limit);
}

The code above shows a generic approach to getting the TOP 10 articles for any article type. Of course, you can extend this approach to find the TOP 10 of other assets as well, including Image Gallery images, Document Library documents, Wiki articles, Blog entries, Message Boards threads, Bookmark entries, slideshows, videos, games, video queues, video lists, playlists, and so on. You may want to practice building these TOP 10 asset features.

Building an article template for the most popular journal articles

We have added a view counter on journal articles and built the VM service for the most popular articles. Now let's build an article template for them.

Setting up the default article type

As mentioned earlier, there is a set of journal article types, for example announcements, blogs, general, news, press-release, updates, article-tout, article-content, and so on. In a real-world case, only some of these types will require a view counter, for example article-content. Let's configure the default article type for the most popular articles by adding the following line at the end of portal-ext.properties:

ext.most_popular_articles.article_type=article-content

This line sets the default article type for most popular articles to article-content.

Query Performance Tuning in Microsoft Analysis Services: Part 1

Packt
20 Oct 2009
41 min read
In this two-part article by Chris Webb, we will cover query performance tuning, including how to design aggregations and partitions and how to write efficient MDX. This first part covers performance-specific design features, along with the concepts of partitions and aggregations.

One of the main reasons for building Analysis Services cubes as part of a BI solution is that you should get better query performance than if you were querying your relational database directly. While it's certainly the case that Analysis Services is very fast, it would be naive to think that all of our queries, however complex, will return in seconds without any tuning being necessary. This article will describe the steps you'll need to go through in order to ensure your cube is as responsive as possible.

How Analysis Services processes queries

Before we start to discuss how to improve query performance, we need to understand what happens inside Analysis Services when a query is run. The two major parts of the Analysis Services engine are:

- The Formula Engine, which processes MDX queries, works out what data is needed to answer them, requests that data from the Storage Engine, and then performs all calculations needed for the query.
- The Storage Engine, which handles all reading and writing of data; it fetches the data that the Formula Engine requests when a query is run and aggregates it to the required granularity.

When you run an MDX query, then, that query goes first to the Formula Engine where it is parsed; the Formula Engine then requests all of the raw data needed to answer the query from the Storage Engine, performs any calculations on that data that are necessary, and then returns the results in a cellset back to the user. There are numerous opportunities for performance tuning at all stages of this process, as we'll see.

Performance tuning methodology

When doing performance tuning there are certain steps you should follow to allow you to measure the effect of any changes you make to your cube, its calculations, or the query you're running:

- Wherever possible, test your queries in an environment that is identical to your production environment. Otherwise, ensure that the size of the cube and the server hardware you're running on is at least comparable, and that you are running the same build of Analysis Services.
- Make sure that no one else has access to the server you're running your tests on. You won't get reliable results if someone else starts running queries at the same time as you.
- Make sure that the queries you're testing with are equivalent to the ones that your users want to have tuned. As we'll see, you can use Profiler to capture the exact queries your users are running against the cube.
- Whenever you test a query, run it twice: first on a cold cache, and then on a warm cache. Make sure you keep a note of the time each query takes to run and what you changed on the cube or in the query for that run.

Clearing the cache is a very important step: queries that run for a long time on a cold cache may be instant on a warm cache. When you run a query against Analysis Services, some or all of the results of that query (and possibly other data in the cube, not required for the query) will be held in cache so that the next time a query is run that requests the same data it can be answered from cache much more quickly. To clear the cache of an Analysis Services database, you need to execute a ClearCache XMLA command.
To do this in SQL Management Studio, open up a new XMLA query window and enter the following:

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ClearCache>
    <Object>
      <DatabaseID>Adventure Works DW 2008</DatabaseID>
    </Object>
  </ClearCache>
</Batch>

Remember that the ID of a database may not be the same as its name; you can check this by right-clicking on a database in the SQL Management Studio Object Explorer and selecting Properties. Alternatives to this method also exist: the MDX Studio tool allows you to clear the cache with a menu option, and the Analysis Services Stored Procedure Project (http://tinyurl.com/asstoredproc) contains code that allows you to clear the Analysis Services cache and the Windows File System cache directly from MDX. Clearing the Windows File System cache is interesting because it allows you to compare the performance of the cube on a warm and cold file system cache as well as a warm and cold Analysis Services cache: when the Analysis Services cache is cold or can't be used for some reason, a warm file system cache can still have a positive impact on query performance.

After the cache has been cleared, before Analysis Services can answer a query it needs to recreate the calculated members, named sets, and other objects defined in a cube's MDX Script. If you have any reasonably complex named set expressions that need to be evaluated, you'll see some activity in Profiler relating to these sets being built, and it's important to be able to distinguish between this and activity that's related to the queries you're actually running. All MDX Script related activity occurs between the Execute MDX Script Begin and Execute MDX Script End events; these are fired after the Query Begin event but before the Query Cube Begin event for the query run after the cache has been cleared. When looking at a Profiler trace you should either ignore everything between the Execute MDX Script Begin and End events or run a query that returns no data at all to trigger the evaluation of the MDX Script, for example:

SELECT {} ON 0
FROM [Adventure Works]

Designing for performance

Many of the recommendations for designing cubes we've given so far have been made on the basis that they will improve query performance, and in fact the performance of a query is intimately linked to the design of the cube it's running against. For example, dimension design, especially optimizing attribute relationships, can have a significant effect on the performance of all queries, at least as much as any of the optimizations described in this article. As a result, we recommend that if you've got a poorly performing query the first thing you should do is review the design of your cube to see if there is anything you could do differently. There may well be some kind of trade-off needed between usability, manageability, time-to-develop, overall "elegance" of the design, and query performance, but since query performance is usually the most important consideration for your users it will take precedence. To put it bluntly, if the queries your users want to run don't run fast, your users will not want to use the cube at all!

Performance-specific design features

Once you're sure that your cube design is as good as you can make it, it's time to look at two features of Analysis Services that are transparent to the end user but have an important impact on performance and scalability: measure group partitioning and aggregations.
Both of these features relate to the Storage Engine and allow it to answer requests for data from the Formula Engine more efficiently.

Partitions

A partition is a data structure that holds some or all of the data held in a measure group. When you create a measure group, by default that measure group contains a single partition that contains all of the data. The Enterprise Edition of Analysis Services allows you to divide a measure group into multiple partitions; Standard Edition is limited to one partition per measure group, and the ability to partition is one of the main reasons why you would want to use Enterprise Edition over Standard Edition.

Why partition?

Partitioning brings two important benefits: better manageability and better performance. Partitions within the same measure group can have different storage modes and different aggregation designs, although in practice they usually don't differ in these respects; more importantly, they can be processed independently, so for example when new data is loaded into a fact table, you can process only the partitions that should contain the new data. Similarly, if you need to remove old or incorrect data from your cube, you can delete or reprocess a partition without affecting the rest of the measure group.

Partitioning can also improve both processing performance and query performance significantly. Analysis Services can process multiple partitions in parallel, and this can lead to much more efficient use of CPU and memory resources on your server while processing is taking place. Analysis Services can also fetch and aggregate data from multiple partitions in parallel when a query is run, and again this can lead to more efficient use of CPU and memory and result in faster query performance. Lastly, Analysis Services will only scan the partitions that contain data necessary for a query, and since this reduces the overall amount of IO needed it can also make queries faster.

Building partitions

You can view, create, and delete partitions on the Partitions tab of the Cube Editor in BIDS. When you run the New Partition Wizard or edit the Source property of an existing partition, you'll see you have two options for controlling what data is used in the partition:

- Table Binding means that the partition contains all of the data in a table or view in your relational data source, or a named query defined in your DSV. You can choose the table you wish to bind to on the Specify Source Information step of the New Partition Wizard, or in the Partition Source dialog if you choose Table Binding from the Binding Type drop-down box.
- Query Binding allows you to specify an SQL SELECT statement to filter the rows you want from a table; BIDS will automatically generate part of the SELECT statement for you, and all you'll need to do is supply the WHERE clause. If you're using the New Partition Wizard, this is the option that will be chosen if you check the Specify a query to restrict rows checkbox on the second step of the wizard; in the Partition Source dialog you can choose this option from the Binding Type drop-down box.

It might seem like query binding is the easiest way to filter your data, and while it's the most widely-used approach it does have one serious shortcoming: since it involves hard-coding an SQL SELECT statement into the definition of the partition, changes to your fact table, such as the deletion or renaming of a column, can mean the SELECT statement errors when it is run if that column is referenced in it.
This in turn will cause partition processing to fail. If you have a lot of partitions in your measure group—and it's not unusual to have over one hundred partitions on a large cube—altering the query used for each one is somewhat time-consuming. Instead, table-binding each partition to a view in your relational database will make this kind of maintenance much easier, although you do of course now need to generate one view for each partition. Alternatively, if you're building query-bound partitions from a single view on top of your fact table (which means you have complete control over the columns the view exposes), you could use a query like SELECT * FROM <view name> in each partition's definition.

It's very important that you check the queries you're using to filter your fact table for each partition. If the same fact table row appears in more than one partition, or if fact table rows don't appear in any partition, this will result in your cube displaying incorrect measure values.

On the Processing and Storage Locations step of the wizard you have the chance to create the partition on a remote server instance, functionality that is called Remote Partitions. This is one way of scaling out Analysis Services: you can have a cube and measure group on one server but store some of the partitions for the measure group on a different server, something like a linked measure group but at a lower level. This can be useful for improving processing performance in situations when you have a very small time window available for processing, but in general we recommend that you do not use remote partitions. They have an adverse effect on query performance and they make management of the cube (especially backup) very difficult.

Also on the same step you have the chance to store the partition at a location other than the default of the Analysis Services data directory. Spreading your partitions over more than one volume may make it easier to improve the IO performance of your solution, although again it can complicate database backup and restore.

After assigning an aggregation design to the partition (we'll talk about aggregations in detail next), the last important property to set on a partition is Slice. The Slice property takes the form of an MDX member, set or tuple—MDX expressions returning members, sets or tuples are not allowed, however—and indicates what data is present in a partition. While you don't have to set it, we strongly recommend that you do so, even for MOLAP partitions, for the following reasons:

While Analysis Services does automatically detect what data is present in a partition during processing, it doesn't always work as well as you'd expect and can result in unwanted partition scanning taking place at query time in a number of scenarios. The following blog entry on the SQLCat team site explains why in detail: http://tinyurl.com/partitionslicing

It acts as a useful safety mechanism to ensure that you only load the data you're expecting into a partition. If, while processing, Analysis Services finds that data is being loaded into the partition that conflicts with what's specified in the Slice property, then processing will fail.

More detail on how to set the Slice property can be found in Mosha Pasumansky's blog entry on the subject here: http://tinyurl.com/moshapartition

Planning a partitioning strategy

We now know why we should be partitioning our measure groups and what to do to create a partition; the next question is: how should we split the data in our partitions?
We need to find some kind of happy medium between the manageability and performance aspects of partitioning—we need to split our data so that we do as little processing as possible, but also so that as few partitions as possible are scanned by our users' queries. Luckily, if we partition by our Time dimension we can usually meet both needs very well: it's usually the case that when new data arrives in a fact table it's for a single day, week or month, and it's also the case that the most popular way of slicing a query is by a time period. Therefore, it's almost always the case that when measure groups are partitioned they are partitioned by time. It's also worth considering, though, whether it's a good idea to partition by time and another dimension: for example, in an international company you might have a Geography dimension and a Country attribute, and users may always be slicing their queries by Country too—in which case it might make sense to partition by Country.

Measure groups that contain measures with the Distinct Count aggregation type require their own specific partitioning strategy. While you should still partition by time, you should also partition by non-overlapping ranges of values within the column you're doing the distinct count on. A lot more detail on this is available in the following white paper: http://tinyurl.com/distinctcountoptimize

It's worth looking at the distribution of data over partitions for dimensions we're not explicitly slicing by, as there is often a dependency between data in these dimensions and the Time dimension: for example, a given Product may only have been sold in certain Years or in certain Countries. You can see the distribution of member DataIDs (the internal key values that Analysis Services creates for all members on a hierarchy) for a partition by querying the Discover_Partition_Dimension_Stat DMV, for example:

SELECT *
FROM SystemRestrictSchema($system.Discover_Partition_Dimension_Stat
 ,DATABASE_NAME = 'Adventure Works DW 2008'
 ,CUBE_NAME = 'Adventure Works'
 ,MEASURE_GROUP_NAME = 'Internet Sales'
 ,PARTITION_NAME = 'Internet_Sales_2003')

The following screenshot shows what the results of this query look like:

There's also a useful Analysis Services stored procedure that shows the same data plus any partition overlaps; it's included in the Analysis Services Stored Procedure Project (a free, community-developed set of sample Analysis Services stored procedures): http://tinyurl.com/partitionhealth. This blog entry describes how you can take this data and visualise it in a Reporting Services report: http://tinyurl.com/viewpartitionslice

We also need to consider what size our partitions should be. In general, between 5 and 20 million rows per partition, or up to around 3GB, is a good size. If you have a measure group with a single partition of below 5 million rows then don't worry, it will perform very well, but it's not worth dividing it into smaller partitions; it's equally possible to get good performance with partitions of 50-60 million rows. It's also best to avoid having too many partitions—if you have more than a few hundred it may make SQL Management Studio and BIDS slow to respond, and it may be worth creating fewer, larger partitions assuming these partitions stay within the size limits for a single partition we've just given.
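To make the time-based approach more concrete, here is a sketch of the kind of SELECT statements you might use for a set of month-level, query-bound partitions. The table and column names used below (dbo.FactInternetSales and OrderDateKey) are purely illustrative assumptions rather than a reference to any particular schema; the important point is that the ranges must not overlap and must not leave gaps:

-- Hypothetical query-bound partition definitions, one partition per month
SELECT * FROM dbo.FactInternetSales
WHERE OrderDateKey BETWEEN 20030101 AND 20030131  -- January 2003 partition

SELECT * FROM dbo.FactInternetSales
WHERE OrderDateKey BETWEEN 20030201 AND 20030228  -- February 2003 partition

SELECT * FROM dbo.FactInternetSales
WHERE OrderDateKey BETWEEN 20030301 AND 20030331  -- March 2003 partition

If you prefer table binding, the same logic would live in one relational view per partition instead, with each view containing one of these WHERE clauses.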
Automatically generating large numbers of partitions

When creating a measure group for the first time, it's likely you'll already have a large amount of data and may need to create a correspondingly large number of partitions for it. Clearly the last thing you'll want to do is create tens or hundreds of partitions manually, so it's worth knowing some tricks to create these partitions automatically. One method involves taking a single partition, scripting it out to XMLA and then pasting and manipulating this in Excel, as detailed here: http://tinyurl.com/generatepartitions. The Analysis Services Stored Procedure Project also contains a set of functions for creating partitions automatically based on MDX set expressions: http://tinyurl.com/autopartition.

Unexpected partition scans

Even when you have configured your partitions properly it's sometimes the case that Analysis Services will scan partitions that you don't expect it to be scanning for a particular query. If you see this happening, the first thing to determine is whether these extra scans are making a significant contribution to your query times. If they aren't, then it's probably not worth worrying about; if they are, there are some things you can try to stop it happening. The extra scans could be the result of a number of factors, including:

The way you have written the MDX for queries or calculations. In most cases it will be very difficult to rewrite the MDX to stop the scans, but the following blog entry describes how it is possible in one scenario: http://tinyurl.com/moshapart

The LastNonEmpty measure aggregation type may result in multiple partition scans. If you can restructure your cube so you can use the LastChild aggregation type instead, Analysis Services will only scan the last partition containing data for the current time period.

In some cases, even when you've set the Slice property, Analysis Services has trouble working out which partitions should be scanned for a query. Changing the attributes mentioned in the Slice property may help, but not always. The section on Related Attributes and Almost Related Attributes in the following blog entry discusses this in more detail: http://tinyurl.com/mdxpartitions

Analysis Services may also decide to retrieve more data than is needed for a query in order to make answering future queries more efficient. This behavior is called prefetching and can be turned off by setting the following connection string properties: Disable Prefetch Facts=True; Cache Ratio=1. More information on this can be found in the section on Prefetching and Request Ordering in the white paper Identifying and Resolving MDX Query Performance Bottlenecks, available from http://tinyurl.com/mdxbottlenecks. Note that setting these connection string properties can have other, negative effects on query performance.

You can set connection string properties in SQL Management Studio when you open a new MDX Query window. Just click the Options button on the Connect to Analysis Services dialog, then go to the Additional Connection Parameters tab. Note that in the RTM version of SQL Management Studio there is a problem with this functionality, so that when you set a connection string property it will continue to be set for all connections, even though the textbox on the Additional Connection Parameters tab is blank, until SQL Management Studio is closed down or until you set the same property differently.
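Returning to the subject of generating partitions automatically: here is a heavily trimmed sketch of the kind of XMLA produced when you script out a query-bound partition, which is what the techniques linked to above manipulate. The object IDs, data source ID and SQL query below are hypothetical placeholders, and a real script contains many more properties, so treat this as an outline rather than something to run as-is:

<Create xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ParentObject>
    <!-- Placeholder IDs: use the IDs of your own database, cube and measure group -->
    <DatabaseID>Adventure Works DW 2008</DatabaseID>
    <CubeID>Adventure Works</CubeID>
    <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
  </ParentObject>
  <ObjectDefinition>
    <Partition xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <ID>Internet_Sales_2004</ID>
      <Name>Internet_Sales_2004</Name>
      <Source xsi:type="QueryBinding">
        <DataSourceID>Adventure Works DW</DataSourceID>
        <QueryDefinition>
          SELECT * FROM dbo.FactInternetSales
          WHERE OrderDateKey BETWEEN 20040101 AND 20041231
        </QueryDefinition>
      </Source>
      <StorageMode>Molap</StorageMode>
    </Partition>
  </ObjectDefinition>
</Create>

Because the structure is so regular, generating one Create command per partition from a list of date ranges (in Excel, or with a small script) is straightforward, which is exactly what the techniques linked to above automate.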
Aggregations

An aggregation is simply a pre-summarised data set, similar to the result of an SQL SELECT statement with a GROUP BY clause, that Analysis Services can use when answering queries. The advantage of having aggregations built in your cube is that they reduce the amount of aggregation the Analysis Services Storage Engine has to do at query time, and building the right aggregations is one of the most important things you can do to improve query performance. Aggregation design is an ongoing process that should start once your cube and dimension designs have stabilised, and which will continue throughout the lifetime of the cube as its structure and the queries you run against it change; in this section we'll talk about the steps you should go through to create an optimal aggregation design.

Creating an initial aggregation design

The first stage in creating an aggregation design should be to create a core set of aggregations that will be generally useful for most queries run against your cube. This should take place towards the end of the development cycle, when you're sure that your cube and dimension designs are unlikely to change much, because any changes are likely to invalidate your aggregations and mean this step has to be repeated. It can't be stressed enough that good dimension design is the key to getting the most out of aggregations: removing unnecessary attributes, setting AttributeHierarchyEnabled to False where possible, building optimal attribute relationships and building user hierarchies will all make the aggregation design process faster, easier and more effective. You should also take care to update the EstimatedRows property of each measure group and partition, and the EstimatedCount of each attribute, before you start, as these values are used by the aggregation design process. BIDS Helper adds a new button to the toolbar in the Partitions tab of the Cube Editor to update all of these count properties with one click.

To build this initial set of aggregations we'll be running the Aggregation Design Wizard, which can be started by clicking the Design Aggregations button on the toolbar of the Aggregations tab of the Cube Editor. This wizard will analyse the structure of your cube and dimensions, look at various property values you've set, and try to come up with a set of aggregations that it thinks should be useful. The one key piece of information it doesn't have at this point is what queries you're running against the cube, so some of the aggregations it designs may not prove to be useful in the long run, but running the wizard is extremely useful for creating a first draft of your aggregation designs.

You can only design aggregations for one measure group at a time; if you have more than one partition in the measure group you've selected then the first step of the wizard asks you to choose which partitions you want to design aggregations for. An aggregation design can be associated with many partitions in a measure group, and a partition can be associated with just one aggregation design or none at all. We recommend that, in most cases, you have just one aggregation design for each measure group for the sake of simplicity. However, if processing time is limited and you need to reduce the overall time spent building aggregations, or if query patterns are different for different partitions within the same measure group, then it may make sense to apply different aggregation designs to different partitions.
The next step of the wizard asks you to review the AggregationUsage property of all the attributes on all of the cube dimensions in your cube; this property can also be set on the Cube Structure tab of the Cube Editor. The following figure shows the Aggregation Design Wizard: The AggregationUsage property controls how dimension attributes are treated in the aggregation design process. The property can be set to the following values: Full: This means the attribute, or an attribute at a lower granularity directly related to it by an attribute relationship, will be included in every single aggregation the wizard builds. We recommend that you use this value sparingly, for at most one or two attributes in your cube, because it can significantly reduce the number of aggregations that get built. You should set it for attributes that will almost always get used in queries. For example, if the vast majority of your queries are at the Month granularity it makes sense that all of your aggregations include the Month attribute from your Time dimension. None: This means the attribute will not be included in any aggregation that the wizard designs. Don't be afraid of using this value for any attributes that you don't think will be used often in your queries, it can be a useful way of ensuring that the attributes that are used often get good aggregation coverage. Note that Attributes with AttributeHierarchyEnabled set to False will have no aggregations designed for them anyway. Unrestricted: This means that the attribute may be included in the aggregations designed, depending on whether the algorithm used by the wizard considers it to be useful or not. Default: The default option applies a complex set of rules, which are: The granularity attribute (usually the key attribute, unless you specified otherwise in the dimension usage tab) is treated as Unrestricted. All attributes on dimensions involved in many-to-many relationships, unmaterialised referenced relationships, and data mining dimensions are treated as None. Aggregations may still be built at the root granularity, that is, the intersection of every All Member on every attribute. All attributes that are used as levels in natural user hierarchies are treated as Unrestricted. Attributes with IsAggregatable set to False are treated as Full. All other attributes are treated as None The next step in the wizard asks you to verify the number of EstimatedRows and EstimatedCount properties we've already talked about, and gives the option of setting a similar property that shows the estimated number of members from an attribute that appear in any one partition. This can be an important property to set: if you are partitioning by month, although you may have 36 members on your Month attribute a partition will only contain data for one of them. On the Set Aggregation Options step you finally reach the point where some aggregations can be built. Here you can apply one last set of restrictions on the set of aggregations that will be built, choosing to either: Estimated Storage Reaches, which means you build aggregations to fill a given amount of disk space. Performance Gain Reaches, the most useful option. It does not mean that all queries will run n% faster; nor does it mean that a query that hits an aggregation directly will run n% faster. 
Think of it like this: if the wizard built all the aggregations it thought were useful to build (note: this is not the same thing as all of the possible aggregations that could be built on the cube) then, in general, performance would be better. Some queries would not benefit from aggregations, some would be slightly faster, and some would be a lot faster; and some aggregations would be more often used than others. So if you set this property to 100% the wizard would build all the aggregations that it could, and you'd get 100% of the performance gain possible from building aggregations. Setting this property to 30%, the default and recommended value, will build the aggregations that give you 30% of this possible performance gain—not 30% of the possible aggregations, usually a much smaller number. As you can see from the screenshot below, the graph drawn on this step plots the size of the aggregations built versus overall performance gain, and the shape of the curve shows that a few, smaller aggregations usually provide the majority of the performance gain. I Click Stop, which means carry on building aggregations until you click the Stop button. Designing aggregations can take a very long time, especially on more complex cubes, because there may literally be millions or billions of possible aggregations that could be built. In fact, it's not unheard of for the aggregation design wizard to run for several days before it's stopped! Do Not Design Aggregations allows you to skip designing aggregations. The approach we suggest taking here is to first select I Click Stop and then click the Start button. On some measure groups this will complete very quickly, with only a few small aggregations built. If that's the case click Next; otherwise, if it's taking too long or too many aggregations are being built, click Stop and then Reset, and then select Performance Gain Reaches and enter 30% and Start again. This should result in a reasonable selection of aggregations being built; in general around 50-100 aggregations is the maximum number you should be building for a measure group, and if 30% leaves you short of this try increasing the number by 10% until you feel comfortable with what you get. On the final step of the wizard, enter a name for your aggregation design and save it. It's a good idea to give the aggregation design a name including the name of the measure group to make it easier to find if you ever need to script it to XMLA. It's quite common that Analysis Services cube developers stop thinking about aggregation design at this point. This is a serious mistake: just because you have run the Aggregation Design Wizard does not mean you have built all the aggregations you need, or indeed any useful ones at all! Doing Usage-Based Optimisation and/or building aggregations manually is absolutely essential. Usage-based optimization We now have some aggregations designed, but the chances are that despite our best efforts many of them will not prove to be useful. To a certain extent we might be able to pick out these aggregations by browsing through them; really, though, we need to know what queries our users are going to run before we can build aggregations to make them run faster. This is where usage-based optimisation comes in: it allows us to log the requests for data that Analysis Services makes when a query is run and then feed this information into the aggregation design process. To be able to do usage-based optimization, you must first set up Analysis Services to log these requests for data. 
This involves specifying a connection string to a relational database in the server properties of your Analysis Services instance and allowing Analysis Services to create a log table in that database. The white paper Configuring the Analysis Services Query Log contains more details on how to do this (it's written for Analysis Services 2005 but is still relevant for Analysis Services 2008), and can be downloaded from http://tinyurl.com/ssasquerylog. The query log is a misleading name, because as you'll see if you look inside it it doesn't actually contain the text of MDX queries run against the cube. When a user runs an MDX query, Analysis Services decomposes it into a set of requests for data at a particular granularity and it's these requests that are logged; we'll look at how to interpret this information in the next section. A single query can result in no requests for data, or it can result in as many as hundreds or thousands of requests, especially if it returns a lot of data and a lot of MDX calculations are involved. When setting up the log you also have to specify the percentage of all data requests that Analysis Services actually logs with the QueryLogSampling property—in some cases if it logged every single request you would end up with a very large amount of data very quickly, but on the other hand if you set this value too low you may end up not seeing certain important long-running requests. We recommend that you start by setting this property to 100 but that you monitor the size of the log closely and reduce the value if you find that the number of requests logged is too high. Once the log has been set up, let your users start querying the cube. Explain to them what you're doing and that some queries may not perform well at this stage. Given access to a new cube it will take them a little while to understand what data is present and what data they're interested in; if they're new to Analysis Services it's also likely they'll need some time to get used to whatever client tool they're using. Therefore you'll need to have logging enabled for at least a month or two before you can be sure that your query log contains enough useful information. Remember that if you change the structure of the cube while you're logging then the existing contents of the log will no longer be usable. Last of all, you'll need to run the Usage-Based Optimisation Wizard to build new aggregations using this information. The Usage-Based Optimisation Wizard is very similar to the Design Aggregations Wizard, with the added option to filter the information in the query log by date, user and query frequency before it's used to build aggregations. It's a good idea to do this filtering carefully: you should probably exclude any queries you've run yourself, for example, since they're unlikely to be representative of what the users are doing, and make sure that the most important users queries are over-represented. Once you've done this you'll have a chance to review what data is actually going to be used before you actually build the aggregations. On the last step of the wizard you have the choice of either creating a new aggregation design or merging the aggregations that have just been created with an existing aggregation design. We recommend the latter: what you've just done is optimize queries that ran slowly on an existing aggregation design, and if you abandon the aggregations you've already got then it's possible that queries which previously had been quick would be slow afterwards. 
This exercise should be repeated at regular intervals throughout the cube's lifetime to ensure that you build any new aggregations that are necessary as the queries your users run change. Query logging can, however, have an impact on query performance, so it's not a good idea to leave logging running all the time.

Processing aggregations

When you've created or edited the aggregations on one or more partitions, you don't need to do a full process on the partitions. All you need to do is deploy your changes and then run a ProcessIndex, which is usually fairly quick, and once you've done that queries will be able to use the new aggregations. When you run a ProcessIndex, Analysis Services does not need to run any SQL queries against the relational data source if you're using MOLAP storage.

Monitoring partition and aggregation usage

Having created and configured your partitions and aggregations, you'll naturally want to be sure that when you run a query Analysis Services is using them as you expect. You can do this very easily by running a trace with SQL Server Profiler or by using MDX Studio (a free MDX editor that can be downloaded from http://tinyurl.com/mdxstudio).

To use Profiler, start it and then connect to your Analysis Services instance to create a new trace. On the Trace Properties dialog choose the Blank template, go to the Events Selection tab and check the following events:

Progress Reports\Progress Report Begin
Progress Reports\Progress Report End
Queries Events\Query Begin
Queries Events\Query End
Query Processing\Execute MDX Script Begin
Query Processing\Execute MDX Script End
Query Processing\Query Cube Begin
Query Processing\Query Cube End
Query Processing\Get Data From Aggregation
Query Processing\Query Subcube Verbose

Then clear the cache and click Run to start the trace. Once you've done this you can either open up your Analysis Services client tool or start running MDX queries in SQL Management Studio. When you do this you'll notice that Profiler starts to show information about what Analysis Services is doing internally to answer these queries. The following screenshot shows what you might typically see:

Interpreting the results of a Profiler trace is a complex task and well outside the scope of this article, but it's very easy to pick out some useful information relating to aggregation and partition usage. Put simply:

The Query Subcube Verbose events represent individual requests for data from the Formula Engine to the Storage Engine, which can be answered either from cache, an aggregation or base-level partition data. Each of these requests is at a single granularity, meaning that all of the data in the request comes from a single distinct combination of attributes; we refer to these granularities as "subcubes". The TextData column for this event shows the granularity of data that is being requested in human-readable form; the Query Subcube event will display exactly the same data but in the less friendly format that the Usage-Based Optimisation Query Log uses.

Pairs of Progress Report Begin and Progress Report End events show that data is being read from disk, either from an aggregation or a partition. The TextData column gives more information, including the name of the object being read; however, if you have more than one object (for example an aggregation) with the same name, you need to look at the contents of the ObjectPath column to see exactly which object is being queried.
The Get Data From Aggregation event is fired when data is read from an aggregation, in addition to any Progress Report events. The Duration column shows how long each of these operations takes in milliseconds. At this point in the cube optimisation process you should be seeing in Profiler that when your users run queries they hit as few partitions as possible and hit aggregations as often as possible. If you regularly see slow queries that scan all the partitions in your cube or which do not use any aggregations at all, you should consider going back to the beginning of the process and rethinking your partitioning strategy and rerunning the aggregation design wizards. In a production system many queries will be answered from cache and therefore be very quick, but you should always try to optimise for the worst-case scenario of a query running on a cold cache. Building aggregations manually However good the aggregation designs produced by the wizards are, it's very likely that at some point you'll have to design aggregations manually for particular queries. Even after running the Usage Based Optimisation Wizard you may find that it still does not build some potentially useful aggregations: the algorithm the wizards use is very complex and something of a black box, so for whatever reason (perhaps because it thinks it would be too large) it may decide not to build an aggregation that, when built manually, turns out to have a significant positive impact on the performance of a particular query. Before we can build aggregations manually we need to work out which aggregations we need to build. To do this, we once again need to use Profiler and look at either the Query Subcube or the Query Subcube Verbose events. These events, remember, display the same thing in two different formats - requests for data made to the Analysis Services storage engine during query processing - and the contents of the Duration column in Profiler will show how long in milliseconds each of these requests took. A good rule of thumb is that any Query Subcube event that takes longer than half a second (500 ms) would benefit from having an aggregation built for it; you can expect that a Query Subcube event that requests data at the same granularity as an aggregation will execute almost instantaneously. The following screenshot shows an example of trace on an MDX query that takes 700ms: The single Query Subcube Verbose event is highlighted, and we can see that the duration of this event is the same as that of the query itself, so if we want to improve the performance of the query we need to build an aggregation for this particular request. Also, in the lower half of the screen we can see the contents of the TextData column displayed. This shows a list of all the dimensions and attributes from which data is being requested —the granularity of the request—and the simple rule to follow here is that whenever you see anything other than a zero by an attribute we know that the granularity of the request includes this attribute. We need to make a note of all of the attributes which have anything other than a zero next to them and then build an aggregation using them; in this case it's just the Product Category attribute of the Product dimension. 
The white paper Identifying and Resolving MDX Query Performance Bottlenecks (again, written for Analysis Services 2005 but still relevant for Analysis Services 2008), available from http://tinyurl.com/mdxbottlenecks, includes more detailed information on how to interpret the information given by the Query Subcube Verbose event. So now that we know what aggregation we need to build, we need to go ahead and build it. We have a choice of tools to do this: we can either use the functionality built into BIDS, or we can use some of the excellent functionality that BIDS Helper provides. In BIDS, to view and edit aggregations, you need to go to the Aggregations tab in the cube editor. On the Standard View you only see a list of partitions and which aggregation designs they have associated with them; if you switch to the Advanced View by pressing the appropriate button on the toolbar, you can view the aggregations in each aggregation design for each measure group. If you right-click in the area where the aggregations are displayed you can also create a new aggregation and once you've done that you can specify the granularity of the aggregation by checking and unchecking the attributes on each dimension. For our particular query we only need to check the box next to the Product Categories attribute, as follows: The small tick at the top of the list of dimensions in the Status row shows that this aggregation has passed the built-in validation rules that BIDS applies to make sure this is a useful aggregation. If you see an amber warning triangle here, hover over it with your mouse and in the tooltip you'll see a list of reasons why the aggregation has failed its status check. If we then deploy and run a ProcessIndex, we can then rerun our original query and watch it use the new aggregation, running much faster as a result: The problem with the native BIDS aggregation design functionality is that it becomes difficult to use when you have complex aggregations to build and edit. The functionality present in BIDS Helper, while it looks less polished, is far more useable and offers many benefits over the BIDS native functionality, for example: The BIDS Helper Aggregation Design interface displays the aggregation granularity in the same way (ie using 1s and 0s, as seen in the screenshot below) as the Query Subcube event in Profiler does, making it easier to cross reference between the two. It also shows attribute relationships when it displays the attributes on each dimension when you're designing an aggregation, as seen on the righthand side in the screenshot that follows. This is essential to being able to build optimal aggregations. It also shows whether an aggregation is rigid or flexible. It has functionality to remove duplicate aggregations and ones containing redundant attributes (see below), and search for similar aggregations. It allows you to create new aggregations based on the information stored in the Query Log. It also allows you to delete unused aggregations based on information from a Profiler trace. Finally, it has some very comprehensive functionality to allow you to test the performance of the aggregations you build (see http://tinyurl.com/testaggs). Unsurprisingly, if you need to do any serious work designing aggregations manually we recommend using BIDS Helper over the built-in functionality. 
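As noted earlier, after deploying a new or changed aggregation you only need to run a ProcessIndex on the affected partitions rather than a full process. A minimal sketch of the XMLA for this is shown below; the database, cube, measure group and partition IDs are placeholders that you would replace with the IDs of your own objects (remember that IDs are not always the same as names):

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Process>
    <Object>
      <!-- Placeholder IDs: check the real IDs of your own objects -->
      <DatabaseID>Adventure Works DW 2008</DatabaseID>
      <CubeID>Adventure Works</CubeID>
      <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
      <PartitionID>Internet_Sales_2003</PartitionID>
    </Object>
    <Type>ProcessIndexes</Type>
  </Process>
</Batch>

Rather than writing this by hand, you can also right-click a partition in SQL Management Studio, choose Process, select the Process Index option and then use the Script button to generate the equivalent XMLA.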
Common aggregation design issues

Several features of your cube design must be borne in mind when designing aggregations, because they can influence how Analysis Services Storage Engine queries are made and therefore which aggregations will be used. These include the following:

There's no point building aggregations above the granularity you are slicing your partitions by. Aggregations are built on a per-partition basis, so for example if you're partitioning by month there's no value in building an aggregation at the Year granularity, since no aggregation can contain more than one month's worth of data. It won't hurt if you do it, it just means that an aggregation at month will be the same size as one at year but useful to more queries. It follows from this that it might be possible to over-partition data and reduce the effectiveness of aggregations, but we have anecdotal evidence from people who have built very large cubes that this is not an issue.

For queries involving a dimension with a many-to-many relationship to a measure group, aggregations must not be built using any attributes from the many-to-many dimension, but instead must be built at the granularity of the attributes with a regular relationship to the intermediate measure group. When a query is run using the Sales Reason dimension (a many-to-many dimension in the Adventure Works cube), Analysis Services first works out which Sales Orders relate to each Sales Reason, and then queries the main measure group for these Sales Orders. Therefore, only aggregations at the Sales Order granularity on the main measure group can be used. As a result, in most cases it's not worth building aggregations for queries on many-to-many dimensions, since the granularity of these queries is often close to that of the original fact table.

Queries involving measures which have semi-additive aggregation types are always resolved at the granularity attribute of the time dimension, so you need to include that attribute in all aggregations.

Queries involving measures with measure expressions require aggregations at the common granularity of the two measure groups involved.

You should not build aggregations including a parent/child attribute; instead you should use the key attribute in aggregations.

No aggregation should include an attribute which has AttributeHierarchyEnabled set to False.

No aggregation should include an attribute that is below the granularity attribute of the dimension for the measure group.

Any attributes which have a default member that is anything other than the All Member, or which have IsAggregatable set to False, should also be included in all aggregations.

Aggregations and indexes are not built for partitions with fewer than 4096 rows. This threshold is set by the IndexBuildThreshold property in msmdsrv.ini; you can change it, but it's not a good idea to do so.

Aggregations should not include redundant attributes, that is to say attributes from the same 'chain' of attribute relationships. For example, if you had a chain of attribute relationships going from month to quarter to year, you should not build an aggregation including month and quarter—it should just include month. This will increase the chance that the aggregation can be used by more queries, as well as reducing the size of the aggregation.

Summary

In this part of the article we covered performance-specific design features such as partitions and aggregations. In the next part, we will cover MDX calculation performance and caching.
Query Performance Tuning in Microsoft Analysis Services: Part 2
MDX calculation performance Optimizing the performance of the Storage Engine is relatively straightforward: you can diagnose performance problems easily and you only have two options—partitioning and aggregation—for solving them. Optimizing the performance of the Formula Engine is much more complicated because it requires knowledge of MDX, diagnosing performance problems is difficult because the internal workings of the Formula Engine are hard to follow, and solving the problem is reliant on knowing tips and tricks that may change from service pack to service pack. Diagnosing Formula Engine performance problems If you have a poorly-performing query, and if you can rule out the Storage Engine as the cause of the problem, then the issue is with the Formula Engine. We've already seen how we can use Profiler to check the performance of Query Subcube events, to see which partitions are being hit and to check whether aggregations are being used; if you subtract the sum of the durations of all the Query Subcube events from the duration of the query as a whole, you'll get the amount of time spent in the Formula Engine. You can use MDX Studio's Profile functionality to do the same thing much more easily—here's a screenshot of what it outputs when a calculation-heavy query is run: The following blog entry describes this functionality in detail: http://tinyurl.com/mdxtrace; but what this screenshot displays is essentially the same thing that we'd see if we ran a Profiler trace when running the same query on a cold and warm cache, but in a much more easy-to-read format. The column to look at here is the Ratio to Total, which shows the ratio of the duration of each event to the total duration of the query. We can see that on both a cold cache and a warm cache the query took almost ten seconds to run but none of the events recorded took anywhere near that amount of time: the highest ratio to parent is 0.09%. This is typical of what you'd see with a Formula Engine-bound query. Another hallmark of a query that spends most of its time in the Formula Engine is that it will only use one CPU, even on a multiple-CPU server. This is because the Formula Engine, unlike the Storage Engine, is single-threaded. As a result if you watch CPU usage in Task Manager while you run a query you can get a good idea of what's happening internally: high usage of multiple CPUs indicates work is taking place in the Storage Engine, while high usage of one CPU indicates work is taking place in the Formula Engine. Calculation performance tuning Having worked out that the Formula Engine is the cause of a query's poor performance then the next step is, obviously, to try to tune the query. In some cases you can achieve impressive performance gains (sometimes of several hundred percent) simply by rewriting a query and the calculations it depends on; the problem is knowing how to rewrite the MDX and working out which calculations contribute most to the overall query duration. Unfortunately Analysis Services doesn't give you much information to use to solve this problem and there are very few tools out there which can help either, so doing this is something of a black art. There are three main ways you can improve the performance of the Formula Engine: tune the structure of the cube it's running on, tune the algorithms you're using in your MDX, and tune the implementation of those algorithms so they use functions and expressions that Analysis Services can run efficiently. 
We've already talked in depth about how the overall cube structure is important for the performance of the Storage Engine, and the same goes for the Formula Engine; the only thing to repeat here is the recommendation that if you can avoid doing a calculation in MDX by doing it at an earlier stage, for example in your ETL or in your relational source, and do so without compromising functionality, you should do so. We'll now go into more detail about tuning algorithms and implementations. Mosha Pasumansky's blog, http://tinyurl.com/moshablog, is a goldmine of information on this subject. If you're serious about learning MDX we recommend that you subscribe to it and read everything he's ever written.

Tuning algorithms used in MDX

Tuning an algorithm in MDX is much the same as tuning an algorithm in any other kind of programming language—it's more a matter of understanding your problem and working out the logic that provides the most efficient solution than anything else. That said, there are some general techniques that can be used often in MDX and which we will walk through here.

Using named sets to avoid recalculating set expressions

Many MDX calculations involve expensive set operations, a good example being rank calculations where the position of a tuple within an ordered set needs to be determined. The following query displays Dates on the Rows axis and, on Columns, a calculated measure that returns the rank of each date within the set of all dates based on the value of the Internet Sales Amount measure:

WITH MEMBER MEASURES.MYRANK AS
Rank
(
 [Date].[Date].CurrentMember
 ,Order
 (
  [Date].[Date].[Date].MEMBERS
  ,[Measures].[Internet Sales Amount]
  ,BDESC
 )
)
SELECT
 MEASURES.MYRANK ON 0
,[Date].[Date].[Date].MEMBERS ON 1
FROM [Adventure Works]

It runs very slowly, and the problem is that every time the calculation is evaluated it has to evaluate the Order function to return the set of ordered dates. In this particular situation, though, you can probably see that the set returned will be the same every time the calculation is called, so it makes no sense to do the ordering more than once. Instead, we can create a named set to hold the ordered set and refer to that named set from within the calculated measure, like so:

WITH SET ORDEREDDATES AS
Order
(
 [Date].[Date].[Date].MEMBERS
 ,[Measures].[Internet Sales Amount]
 ,BDESC
)
MEMBER MEASURES.MYRANK AS
Rank
(
 [Date].[Date].CurrentMember
 ,ORDEREDDATES
)
SELECT
 MEASURES.MYRANK ON 0
,[Date].[Date].[Date].MEMBERS ON 1
FROM [Adventure Works]

This version of the query is many times faster, simply as a result of improving the algorithm used; the problem is explored in more depth in this blog entry: http://tinyurl.com/mosharank

Since normal named sets are only evaluated once they can be used to cache set expressions in some circumstances; however, the fact that they are static means they can be too inflexible to be useful most of the time. Note that normal named sets defined in the MDX Script are only evaluated once, when the MDX Script executes, and not in the context of any particular query, so it wouldn't be possible to change the example above so that the set and calculated measure were defined on the server. Even named sets defined in the WITH clause are evaluated only once, in the context of the WHERE clause, so it wouldn't be possible to crossjoin another hierarchy on columns and use this approach, because for it to work the set would have to be reordered once for each column.
The introduction of dynamic named sets in Analysis Services 2008 improves the situation a little, and other more advanced techniques can be used to work around these issues, but in general named sets are less useful than you might hope. For further reading on this subject see the following blog posts:

http://tinyurl.com/chrisrank
http://tinyurl.com/moshadsets
http://tinyurl.com/chrisdsets

Using calculated members to cache numeric values

In the same way that you can avoid unnecessary re-evaluations of set expressions by using named sets, you can also rely on the fact that the Formula Engine can (usually) cache the result of a calculated member to avoid recalculating expressions which return numeric values. What this means in practice is that anywhere in your code you see an MDX expression that returns a numeric value repeated across multiple calculations, you should consider abstracting it to its own calculated member; not only will this help performance, but it will improve the readability of your code. For example, take the following slow query, which includes two calculated measures:

WITH MEMBER [Measures].TEST1 AS
 [Measures].[Internet Sales Amount]
 /
 Count
 (
  TopPercent
  (
   {
    [Scenario].[Scenario].&[1]
    ,[Scenario].[Scenario].&[2]
   }
   * [Account].[Account].[Account].MEMBERS
   * [Date].[Date].[Date].MEMBERS
   ,10
   ,[Measures].[Amount]
  )
 )
MEMBER [Measures].TEST2 AS
 [Measures].[Internet Tax Amount]
 /
 Count
 (
  TopPercent
  (
   {
    [Scenario].[Scenario].&[1]
    ,[Scenario].[Scenario].&[2]
   }
   * [Account].[Account].[Account].MEMBERS
   * [Date].[Date].[Date].MEMBERS
   * [Department].[Departments].[Department Level 02].MEMBERS
   ,10
   ,[Measures].[Amount]
  )
 )
SELECT
 {
  [Measures].TEST1
  ,[Measures].TEST2
 } ON 0
,[Customer].[Gender].[Gender].MEMBERS ON 1
FROM [Adventure Works]

A quick glance over the code shows that a large section of it occurs in both calculations—everything inside the Count function. If we move that code into its own calculated member, as follows:

WITH MEMBER [Measures].Denominator AS
 Count
 (
  TopPercent
  (
   {
    [Scenario].[Scenario].&[1]
    ,[Scenario].[Scenario].&[2]
   }
   * [Account].[Account].[Account].MEMBERS
   * [Date].[Date].[Date].MEMBERS
   ,10
   ,[Measures].[Amount]
  )
 )
MEMBER [Measures].TEST1 AS
 [Measures].[Internet Sales Amount] / [Measures].Denominator
MEMBER [Measures].TEST2 AS
 [Measures].[Internet Tax Amount] / [Measures].Denominator
SELECT
 {
  [Measures].TEST1
  ,[Measures].TEST2
 } ON 0
,[Customer].[Gender].[Gender].MEMBERS ON 1
FROM [Adventure Works]

The query runs much faster, simply because instead of evaluating the count twice, once for each of the two visible calculated measures, we evaluate it once, cache the result in the calculated measure Denominator and then reference this in the other calculated measures.

It's also possible to find situations where you can rewrite code to avoid evaluating a calculation that always returns the same result over different cells in the multidimensional space of the cube. This is much more difficult to do effectively, though; the following blog entry describes how to do it in detail: http://tinyurl.com/fecache

Tuning the implementation of MDX

Like just about any other software product, Analysis Services is able to do some things more efficiently than others. It's possible to write the same query or calculation using the same algorithm but using different MDX functions and see a big difference in performance; as a result, we need to know which functions we should use and which ones we should avoid. Which ones are these though?
Luckily MDX Studio includes functionality to analyse MDX code and flag up such problems—to do this you just need to click the Analyze button—and there's even an online version of MDX Studio that allows you to do this too, available at: http://mdx.mosha.com/. We recommend that you run any MDX code you write through this functionality and take its suggestions on board. Mosha walks through an example of using MDX Studio to optimise a calculation on his blog here: http://tinyurl.com/moshaprodvol Block computation versus cell-by-cellWhen the Formula Engine has to evaluate an MDX expression for a query it can basically do so in one of two ways. It can evaluate the expression for each cell returned by the query, one at a time, an evaluation mode known as "cell-by-cell"; or it can try to analyse the calculations required for the whole query and find situations where the same expression would need to be calculated for multiple cells and instead do it only once, an evaluation mode known variously as "block computation" or "bulk evaluation". Block computation is only possible in some situations, depending on how the code is written, but is often many times more efficient than cell-by-cell mode. As a result, we want to write MDX code in such a way that the Formula Engine can use block computation as much as possible, and when we talk about using efficient MDX functions or constructs then this is what we in fact mean. Given that different calculations in the same query, and different expressions within the same calculation, can be evaluated using block computation and cell-by-cell mode, it’s very difficult to know which mode is used when. Indeed in some cases Analysis Services can’t use block mode anyway, so it’s hard know whether we have written our MDX in the most efficient way possible. One of the few indicators we have is the Perfmon counter MDXTotal Cells Calculated, which basically returns the number of cells in a query that were calculated in cell-by-cell mode; if a change to your MDX increments this value by a smaller amount than before, and the query runs faster, you're doing something right. The list of rules that MDX Studio applies is too long to list here, and in any case it is liable to change in future service packs or versions; another good guide for Analysis Services 2008 best practices exists in the Books Online topic Performance Improvements for MDX in SQL Server 2008 Analysis Services, available online here: http://tinyurl.com/mdximp. However, there are a few general rules that are worth highlighting: Don't use the Non_Empty_Behavior calculation property in Analysis Services 2008, unless you really know how to set it and are sure that it will provide a performance benefit. It was widely misused with Analysis Services 2005 and most of the work that went into the Formula Engine for Analysis Services 2008 was to ensure that it wouldn't need to be set for most calculations. This is something that needs to be checked if you're migrating an Analysis Services 2005 cube to 2008. Never use late binding functions such as LookupCube, or StrToMember or StrToSet without the Constrained flag, inside calculations since they have a serious negative impact on performance. It's almost always possible to rewrite calculations so they don't need to be used; in fact, the only valid use for StrToMember or StrToSet in production code is when using MDX parameters. The LinkMember function suffers from a similar problem but is less easy to avoid using it. 
Use the NonEmpty function wherever possible; it can be much more efficient than using the Filter function or other methods. Never use NonEmptyCrossjoin either: it's deprecated, and everything you can do with it you can do more easily and reliably with NonEmpty. Lastly, don't assume that whatever worked best for Analysis Services 2000 or 2005 is still best practice for Analysis Services 2008. In general, you should always try to write the simplest MDX code possible initially, and then only change it when you find performance is unacceptable. Many of the tricks that existed to optimise common calculations for earlier versions now perform worse on Analysis Services 2008 than the straightforward approaches they were designed to replace. Caching We've already seen how Analysis Services can cache the values returned in the cells of a query, and how this can have a significant impact on the performance of a query. Both the Formula Engine and the Storage Engine can cache data, but may not be able to do so in all circumstances; similarly, although Analysis Services can share the contents of the cache between users there are several situations where it is unable to do so. Given that in most cubes there will be a lot of overlap in the data that users are querying, caching is a very important factor in the overall performance of the cube and as a result ensuring that as much caching as possible is taking place is a good idea. Formula cache scopes There are three different cache contexts within the Formula Engine, which relate to how long data can be stored within the cache and how that data can be shared between users: Query Context, which means that the results of calculations can only be cached for the lifetime of a single query and so cannot be reused by subsequent queries or by other users. Session Context, which means the results of calculations are cached for the lifetime of a session and can be reused by subsequent queries in the same session by the same user. Global Context, which means the results of calculations are cached until the cache has to be dropped because data in the cube has changed (usually when some form of processing takes place on the server). These cached values can be reused by subsequent queries run by other users as well as the user who ran the original query. Clearly the Global Context is the best from a performance point of view, followed by the Session Context and then the Query Context; Analysis Services will always try to use the Global Context wherever possible, but it is all too easy to accidentally write queries or calculations that force the use of the Session Context or the Query Context. Here's a list of the most important situations when that can happen: If you define any calculations (not including named sets) in the WITH clause of a query, even if you do not use them, then Analysis Services can only use the Query Context (see http://tinyurl.com/chrisfecache for more details). If you define session-scoped calculations but do not define calculations in the WITH clause, then the Session Context must be used. Using a subselect in a query will force the use of the Query Context (see http://tinyurl.com/chrissubfe). Use of the CREATE SUBCUBE statement will force the use of the Session Context. When a user connects to a cube using a role that uses cell security, then the Query Context will be used. 
Calculations that contain non-deterministic functions (functions which could return different results each time they are called, for example the Now() function that returns the system date and time, the Username() function, or any Analysis Services stored procedure) also force the use of the Query Context.

Other scenarios that restrict caching

Apart from the restrictions imposed by cache context, there are other scenarios where caching is either turned off or restricted.

When arbitrary-shaped sets are used in the WHERE clause of a query, no caching at all can take place in either the Storage Engine or the Formula Engine. An arbitrary-shaped set is a set of tuples that cannot be created by a crossjoin, for example:

({([Customer].[Country].&[Australia], [Product].[Category].&[1]),
  ([Customer].[Country].&[Canada], [Product].[Category].&[3])})

If your users frequently run queries that use arbitrary-shaped sets then this can represent a very serious problem, and you should consider redesigning your cube to avoid it. The following blog entries discuss this problem in more detail:

http://tinyurl.com/tkarbset
http://tinyurl.com/chrisarbset

Even within the Global Context, the presence of security can affect the extent to which cache can be shared between users. When dimension security is used, the contents of the Formula Engine cache can only be shared between users who are members of roles which have the same permissions. Worse, the contents of the Formula Engine cache cannot be shared between users who are members of roles which use dynamic security at all, even if those users do in fact share the same permissions.

Cache warming

Since we can expect many of our queries to run instantaneously on a warm cache, and the majority at least to run faster on a warm cache than on a cold cache, it makes sense to preload the cache with data so that when users come to run their queries they will get warm-cache performance. There are two basic ways of doing this: running CREATE CACHE statements, and automatically running batches of queries.

Create Cache statement

The CREATE CACHE statement allows you to load a specified subcube of data into the Storage Engine cache. Here's an example of what it looks like:

CREATE CACHE FOR [Adventure Works] AS
({[Measures].[Internet Sales Amount]},
 [Customer].[Country].[Country].MEMBERS,
 [Date].[Calendar Year].[Calendar Year].MEMBERS)

More detail on this statement can be found here: http://tinyurl.com/createcache

CREATE CACHE statements can be added to the MDX Script of the cube so they execute every time the MDX Script is executed, although if the statements take a long time to execute (as they often do) then this might not be a good idea; they can also be run after processing has finished from an Integration Services package using an Execute SQL task or through ASCMD, and this is a much better option because it means you have much more control over when the statements actually execute—you wouldn't want them running every time you cleared the cache, for instance.

Running batches of queries

The main drawback of the CREATE CACHE statement is that it can only be used to populate the Storage Engine cache, and in many cases it's warming the Formula Engine cache that makes the biggest difference to query performance.
The only way to do this is to automate the execution of large batches of MDX queries (potentially captured by running a Profiler trace while users go about their work) that return the results of calculations and which will therefore warm the Formula Engine cache. This automation can be done in a number of ways, for example by using the ASCMD command line utility, which is part of the sample code for Analysis Services that Microsoft provides (available for download here: http://tinyurl.com/sqlprodsamples); another common option is to use an Integration Services package to run the queries, as described in the following blog entries: http://tinyurl.com/chriscachewarm and http://tinyurl.com/allancachewarm
This approach is not without its own problems, though: it can be very difficult to make sure that the queries you're running return all the data you want to load into cache, and even when you have done that, user query patterns change over time, so ongoing maintenance of the set of queries is important.

Scale-up and scale-out
Buying better or more hardware should be your last resort when trying to solve query performance problems: it's expensive, and you need to be completely sure that it will indeed improve matters. Adding more memory will increase the space available for caching but nothing else; adding more or faster CPUs will lead to faster queries, but you might be better off investing time in building more aggregations or tuning your MDX. Scaling up as much as your hardware budget allows is a good idea, but it may have little impact on the performance of individual problem queries unless you badly under-specified your Analysis Services server in the first place.
If your query performance degrades as the number of concurrent users running queries increases, consider scaling out by implementing what's known as an OLAP farm. This architecture is widely used in large implementations and involves running multiple Analysis Services instances on different servers and using network load balancing to distribute user queries between them. Each of these instances needs to have the same database on it, and each of these databases must contain exactly the same data for queries to be answered consistently. This means that, as the number of concurrent users increases, you can easily add new servers to handle the increased query load. It also has the added advantage of removing a single point of failure, so if one Analysis Services server fails, the others take on its load automatically. Making sure that data is the same across all servers is a complex operation and you have a number of different options for doing this: you can use the Analysis Services database synchronisation functionality, copy the data from one location to another using a tool like Robocopy, or use the new Analysis Services 2008 shared scalable database functionality. The following white paper from the SQLCat team describes how the first two options can be used to implement a network load-balanced solution for Analysis Services 2005: http://tinyurl.com/ssasnlb.
Shared scalable databases have a significant advantage over synchronisation and file copying in that they don't need to involve any moving of files at all.
They can be implemented using the same approach described in the white paper above, but instead of copying the databases between instances you process a database (attached in ReadWrite mode) on one server, detach it, and then attach it in ReadOnly mode to one or more user-facing servers for querying, while the files themselves stay in one place. You do, however, have to ensure that your disk subsystem does not become a bottleneck as a result.

Summary
In this article we covered MDX calculation performance and caching, and how to write MDX that lets the Formula Engine work as efficiently as possible. We've also seen how important caching is to overall query performance, what we need to do to ensure that data is cached as often as possible, and how to scale out Analysis Services using network load balancing to handle large numbers of concurrent users.

Customizing Page Management in Liferay Portal 5.2 Systems Development

Packt
20 Oct 2009
5 min read
Customizing page management with more features
The Ext Manage Pages portlet not only clones the out of the box Manage Pages portlet, but also extends the model and service, supporting customized data such as Keywords. We can make these Keywords localized too.

Adding the localized feature
Liferay portal is designed to handle as many languages as you want to support. By default, it supports up to 22 languages. When a page is loading, the portal detects the language, pulls up the corresponding language file, and displays the text in the correct language. We want the Keywords to be localized too. For example, the default language is English (United States) and the localized language is Deutsch (Deutschland). Thus, you have the ability to enter not only the Name and HTML Title in German, but also the Keywords in German.
As shown in the following screenshot, when you change the language of the page to German using the language portlet, you will see the entire web site change to German, including the portlet title and input fields. For example, the title of the portlet now shows Ext Seiteneinstellungen, and Keywords becomes Schlüsselwörter.
How do we implement this feature? In other words, how do we customize the language display in page management? Let's add the localized feature to the Ext Manage Pages portlet.

Extending the model for locale
First of all, we need to extend the model and its implementation in order to support the localized feature. For the ExtLayout model, let's add the locale methods first.
Locate the ExtLayout.java file in the com.ext.portlet.layout.model package in the /ext/ext-service/src folder, and open it.
Add the following lines before the last } in ExtLayout.java and save it:

public String getKeywords(Locale locale);
public String getKeywords(String localeLanguageId);
public String getKeywords(Locale locale, boolean useDefault);
public String getKeywords(String localeLanguageId, boolean useDefault);
public void setKeywords(String keywords, Locale locale);

As shown in the code above, this adds getter and setter methods for the Keywords field with locale support. Now let's add the implementation for the ExtLayout model:
Locate the ExtLayoutImpl.java file in the com.ext.portlet.layout.model.impl package in the /ext/ext-impl/src folder and open it.
Add the following lines before the last } in the ExtLayoutImpl.java file and save it:

public String getKeywords(Locale locale) {
    String localeLanguageId = LocaleUtil.toLanguageId(locale);
    return getKeywords(localeLanguageId);
}

public String getKeywords(String localeLanguageId) {
    return LocalizationUtil.getLocalization(getKeywords(), localeLanguageId);
}

public String getKeywords(Locale locale, boolean useDefault) {
    String localeLanguageId = LocaleUtil.toLanguageId(locale);
    return getKeywords(localeLanguageId, useDefault);
}

public String getKeywords(String localeLanguageId, boolean useDefault) {
    return LocalizationUtil.getLocalization(
        getKeywords(), localeLanguageId, useDefault);
}

public void setKeywords(String keywords, Locale locale) {
    String localeLanguageId = LocaleUtil.toLanguageId(locale);
    if (Validator.isNotNull(keywords)) {
        setKeywords(LocalizationUtil.updateLocalization(
            getKeywords(), "keywords", keywords, localeLanguageId));
    }
    else {
        setKeywords(LocalizationUtil.removeLocalization(
            getKeywords(), "keywords", localeLanguageId));
    }
}

As shown in the code above, this adds the implementation for the get and set methods of the ExtLayout model.
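As a minimal usage sketch (the calling class below is hypothetical and not part of the Ext source), any code that holds an ExtLayout and the request locale can now read the localized Keywords through the getter added above, falling back to the default language when no translation exists:

import java.util.Locale;

import com.ext.portlet.layout.model.ExtLayout;

public class ExtLayoutKeywordsExample {

    public static String keywordsFor(ExtLayout layout, Locale locale) {
        // true = fall back to the default language value when no translation exists
        return layout.getKeywords(locale, true);
    }

}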
Customizing language properties
Language files have locale-specific definitions. By default, Language.properties (at /portal/portal-impl/src/content) contains English phrase variations further defined for the United States, while Language_de.properties (at /portal/portal-impl/src/content) contains German phrase variations further defined for Germany. In Ext, Language-ext.properties (available at /ext/ext-impl/src/content) contains English phrase variations further defined for the United States, while Language-ext_de.properties (should be available at /ext/ext-impl/src/content) contains German phrase variations further defined for Germany.
First, let's add a message in Language-ext.properties, using the following steps:
Locate the Language-ext.properties file in the /ext/ext-impl/src/content folder and open it.
Add the following line after the line view-reports=View Reports for Books and save it:

keywords=Keywords

This code specifies the keywords message key with a Keywords value in English. Then we need to add the German language feature in Language-ext_de.properties as follows:
Create a language file Language-ext_de.properties in the /ext/ext-impl/src/content folder and open it.
Add the following lines at the beginning and save it:

## Portlet names
javax.portlet.title.EXT_1=Berichte
javax.portlet.title.jsp_portlet=JSP Portlet
javax.portlet.title.book_reports=Berichte für das Buch
javax.portlet.title.extLayoutManagement=Ext Seiteneinstellungen
javax.portlet.title.extCommunities=Ext Communities

## Messages
view-reports=Ansicht-Berichte für Bücher
keywords=Schlüsselwörter

## Category titles
category.book=Buch

## Model resources
model.resource.com.ext.portlet.reports.model.ReportsEntry=Buch

## Action names
action.ADD_BOOK=Fügen Sie Buch hinzu

As shown in the code above, it specifies the same keys as Language-ext.properties, but all the values are given in German instead of English. For example, the keywords message has the value Schlüsselwörter in German.
In addition, you can set German as the default language and Germany as the default country if required. Here are the simple steps to do so:
Locate the system-ext.properties file in the /ext/ext-impl/src folder and open it.
Add the following lines at the end of system-ext.properties and save it:

user.country=DE
user.language=de

The code above sets the default locale: the language German (Deutsch) and the country Germany (Deutschland).
In general, there are many language files, for example Language-ext.properties and Language-ext_de.properties, and some language files overwrite others at runtime. For example, Language-ext_de.properties will overwrite Language-ext.properties when the language is set to German. These are the three simple rules that determine the priorities of these language files:
- The ext versions take precedence over the non-ext versions.
- The language-specific versions, for example _de, take precedence over the non language-specific versions.
- The location-specific versions, such as -ext_de, take precedence over the non location-specific versions.
For instance, the following is the ranking for the German language, from highest to lowest priority:
- Language-ext_de.properties
- Language_de.properties
- Language-ext.properties
- Language.properties
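To check the result, here is a hedged sketch of how the new key would typically be rendered in one of the portlet's JSPs; the standard Liferay message taglib looks the key up in the language files above, so the label switches between Keywords and Schlüsselwörter with the request locale (the taglib declaration is normally pulled in through the portlet's init.jsp and is shown here only for completeness):

<%@ taglib uri="http://liferay.com/tld/ui" prefix="liferay-ui" %>

<%-- Renders "Keywords" or "Schlüsselwörter" depending on the current locale --%>
<liferay-ui:message key="keywords" />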

Events in Oracle 11g Database

Packt
20 Oct 2009
5 min read
Generally, jobs run immediately upon being enabled, or when we call the run_job procedure of the dbms_scheduler package. Many jobs are time-based; they are controlled by a schedule based on some kind of calendar. However, not everything in real life can be controlled by a calendar. Many things need an action on an ad hoc basis, depending on the occurrence of some other thing. This is called event-based scheduling. Events also exist as the outcome of a job. We can define a job to raise an event in several ways: when it ends, when it ends in an error, or when it does not end within the expected runtime. Let's start with creating job events in order to make job monitoring a lot easier for you.

Monitoring job events
Most of the time when jobs just do their work as expected, there is not much to monitor. In most cases, the job controller has to fix application-specific problems (for example, sometimes file systems or table spaces get filled up). To make this easier, we can incorporate events. We can make jobs raise events when something unexpected happens, and we can have the Scheduler generate events when a job runs for too long. This gives us tremendous power. We can also use this to make chains a little easier to maintain.

Events in chains
A chain consists of steps that depend on each other. In many cases, it does not make sense to continue to step 2 when step 1 fails. For example, when a create table fails, why try to load data into the nonexistent table? So it is logical to terminate the job if no other independent steps can be performed. One of the ways to handle this is implementing error steps in the chain. This might be a good idea, but the disadvantage is that this quickly doubles the steps involved in the chain, where most of the steps (hopefully) will not be executed. Another disadvantage is that the chain becomes less maintainable. It's a lot of extra code, and more code (mostly) gives us less oversight.
If a job chain has to be terminated because of a failure, creating an event handler that reacts to a raised Scheduler event is recommended instead of adding extra steps that try to tell which error possibly happened. This makes event notification a lot simpler because it's all in separate code and not mixed up with the application code.
Another situation is when the application logic has to take care of steps that fail, and has well-defined countermeasures to be executed that make the total outcome of the job a success. An example is a situation that starts with a test for the existence of a file. If the test fails, get the file by FTP; and if this succeeds, load it into the database. In this case, the first step can fail and go to the step that gets the file. As there is no other action possible when the FTP action fails, this should raise a Scheduler event that triggers, for example, a notification action. The same should happen when the load fails. In other third-party scheduling packages, I have seen these notification actions implemented as part of the chain definitions because they lack a Scheduler event queue. In such packages, messages are sent by mail in extra chain steps. In the Oracle Scheduler, this queue is present and is very useful for us. Compared to 10g, nothing has changed in 11g. An event monitoring package can dequeue from the SCHEDULER$_EVENT_QUEUE queue into a variable of the sys.scheduler$_event_info type. The definition is shown in the following screenshot:
What you can do with an event handler is up to your imagination.
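As an illustration, here is a hedged sketch of what such an event handler might look like; the subscriber name and the notification logic are assumptions, but the queue and payload type are the ones named above. The first block registers a subscriber for the Scheduler event queue, and the second dequeues one event into a sys.scheduler$_event_info variable:

BEGIN
  -- one-off setup: register a subscriber/agent for the Scheduler event queue
  DBMS_SCHEDULER.ADD_EVENT_QUEUE_SUBSCRIBER('JOB_EVENT_AGENT');
END;
/

DECLARE
  l_dequeue_options    DBMS_AQ.DEQUEUE_OPTIONS_T;
  l_message_properties DBMS_AQ.MESSAGE_PROPERTIES_T;
  l_message_handle     RAW(16);
  l_event              SYS.SCHEDULER$_EVENT_INFO;
BEGIN
  l_dequeue_options.consumer_name := 'JOB_EVENT_AGENT';
  l_dequeue_options.wait          := DBMS_AQ.FOREVER;  -- block until an event arrives

  DBMS_AQ.DEQUEUE(
    queue_name         => 'SYS.SCHEDULER$_EVENT_QUEUE',
    dequeue_options    => l_dequeue_options,
    message_properties => l_message_properties,
    payload            => l_event,
    msgid              => l_message_handle);

  -- l_event.event_type, l_event.object_name (the job name), l_event.error_code
  -- and l_event.log_id are now available for rule lookups and notifications
  COMMIT;
END;
/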
The following DB Console screenshot shows the interface that can be used to specify which events to raise:
It is easy to generate an event for every possible event listed above and have the handler decide (by the rules defined in tables) what to do. This does sound a little creepy, but it is not very different from having a table that can be used as a lookup for the job found in the event message, where, most of the time, a notification mail is sent, or not sent. Sometimes a user wants to get a message when a job starts running; and most of the time, they want a message when a job ends.
In a chain, it is especially important to be able to tell in which step the event happened and what that step was supposed to do. In the event message, only the job name is present, so you have to search a bit to find the name of the step that failed. For this, we can use the LOG_ID to find the step name in the scheduler job log views (USER_, DBA_, or ALL_SCHEDULER_JOB_LOG), where the step name is listed as JOB_SUBNAME. The following query can be used to find the step name from the ALL_SCHEDULER_JOB_LOG view, assuming that the event message is received in msg:

select job_subname
  from all_scheduler_job_log
 where log_id = msg.log_id;

To enable the delivery of all the events a job can generate, we can set the raise_events attribute to a value of:

dbms_scheduler.job_started + dbms_scheduler.job_succeeded +
dbms_scheduler.job_failed + dbms_scheduler.job_broken +
dbms_scheduler.job_completed + dbms_scheduler.job_stopped +
dbms_scheduler.job_sch_lim_reached + dbms_scheduler.job_disabled +
dbms_scheduler.job_chain_stalled

Or, in short, we can set it to dbms_scheduler.job_all_events.
There are many things that can be called events. In the job system, there are basically two types of events: events caused by jobs (which we already discussed) and events that make a job execute.
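To make this concrete, here is a hedged sketch; the job, program, queue, and agent names, as well as the payload attribute event_name, are assumptions. The first call switches on the delivery of all events for an existing job using the attribute value shown above, and the second creates a job of the second type, one that starts when a matching message arrives on a user-defined event queue:

BEGIN
  -- raise every possible event for an existing job
  DBMS_SCHEDULER.SET_ATTRIBUTE(
    name      => 'NIGHTLY_LOAD_JOB',               -- assumed job name
    attribute => 'raise_events',
    value     => DBMS_SCHEDULER.JOB_ALL_EVENTS);

  -- an event-based job: it runs when an application enqueues a matching message
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'PROCESS_INCOMING_FILE',    -- assumed job name
    program_name    => 'LOAD_FILE_PROGRAM',        -- assumed existing program
    event_condition => 'tab.user_data.event_name = ''FILE_ARRIVED''',
    queue_spec      => 'APPS.FILE_EVENTS_Q, FILE_AGENT',  -- assumed queue and agent
    enabled         => TRUE);
END;
/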

Building Personal Community in Liferay Portal 5.2

Packt
20 Oct 2009
7 min read
Besides public web sites, it would be nice if we could provide a personal community, that is, My Community, for each registered user. In My Community, users can have a set of public and private pages. Here you can save your favorite games, videos and playlists, and the My Street theme, that is, your background color.
As shown in the following screenshot, there is a page my_street with a portlet myStreet, where an end user can sign up, log in, or handle an issue such as Forgot Your Password?
When you click on the SIGN UP button, the My Street portlet allows end users to set up their account: a nickname, a password hint question and answer, a password, and the password verification. Further, as the end user, you can set up favorite games, videos and playlists, and the background color. You can have your own favorites: the number of favorites displayed in My Street (for example, 4, 20, and 35) and the My Street theme (for example, default, Abby Cadabby, Bert, Big Bird, Cookie Monster, and so on). A set of default My Street themes is predefined in /cms_services/images/my_street/ under the folder $CATALINA_HOME/webapps/, and you can choose any one of them at any time. At the same time, you can upload a photo to your own Book Street web page.
When logged in, the My Street theme will be applied on the Home page, Games landing page, Videos landing page, and Playlist landing page. For example, the current user's My Street theme could be applied on the Playlist landing page.
You may play videos, games, and playlists when you visit the web site www.bookpubstreet.com. When you find your favorite videos, games, and playlists, you can add them to My Street. As shown in the following screenshot, you could be playing the playlist Tickle Time. If you are interested in this playlist, just click on the Add to My Street button. The playlist Tickle Time will be added to My Street as your favorite playlist.
In this section, we will show how to implement these features.

Customizing the user model
First, let's customize the user model in service.xml in order to support the extended user and user preferences. To do so, use the following steps:
Create a package com.ext.portlet.user in the /ext/ext-impl/src folder.
Create an XML file service.xml in the package com.ext.portlet.user and open it.
Add the following lines in this file and save it:

<?xml version="1.0"?>
<!DOCTYPE service-builder PUBLIC "-//Liferay//DTD Service Builder 5.2.0//EN"
  "http://www.liferay.com/dtd/liferay-service-builder_5_2_0.dtd">
<service-builder package-path="com.ext.portlet.user">
  <namespace>ExtUser</namespace>
  <entity name="ExtUser" uuid="false" local-service="true"
    remote-service="true"
    persistence-class="com.ext.portlet.user.service.persistence.ExtUserPersistenceImpl">
    <column name="userId" type="long" primary="true" />
    <column name="favorites" type="int" />
    <column name="theme" type="String" />
    <column name="printable" type="boolean" />
    <column name="plainPassword" type="String" />
    <column name="creator" type="String" />
    <column name="modifier" type="String" />
    <column name="created" type="Date" />
    <column name="modified" type="Date" />
  </entity>
  <entity name="ExtUserPreference" uuid="false" local-service="true"
    remote-service="true"
    persistence-class="com.ext.portlet.user.service.persistence.ExtUserPreferencePersistenceImpl">
    <column name="userPreferenceId" type="long" primary="true" />
    <column name="userId" type="long" />
    <column name="favorite" type="String" />
    <column name="favoriteType" type="String" />
    <column name="date_" type="Date" />
  </entity>
  <exceptions>
    <exception>ExtUser</exception>
    <exception>ExtUserPreference</exception>
  </exceptions>
</service-builder>

The code above shows the customized user model, including userId (associated with the User_ table), favorites, theme, printable, plainPassword, and so on. It also shows the user preferences model, including userPreferenceId, userId, favorite (for example, a game/video/playlist UID), favoriteType (for example, game/video/playlist), and the update date date_. Of course, these models are extensible. You can extend them for your current needs or future requirements.
Enter the following tables into the database through the command prompt:

create table ExtUser (
  userId bigint not null primary key,
  creator varchar(125),
  modifier varchar(125),
  created datetime null,
  modified datetime null,
  favorites smallint,
  theme varchar(125),
  plainPassword varchar(125),
  printable boolean
);

create table ExtUserPreference (
  userPreferenceId bigint not null primary key,
  userId bigint not null,
  favorite varchar(125),
  favoriteType varchar(125),
  date_ datetime null
);

The preceding code is the database SQL script for the customized user model and user preference model. Matching the XML model ExtUser, it creates the userId, favorites, theme, plainPassword, and printable table fields; matching the XML model ExtUserPreference, it creates the userPreferenceId, userId, favorite, favoriteType, and date_ table fields.
Afterwards, we need to build a service with ServiceBuilder. After preparing service.xml, you can build the services. To do so, locate the XML file /ext/ext-impl/build-parent.xml, open it, add the following lines between </target> and <target name="build-service-portlet-reports">, and save it:

<target name="build-service-portlet-extUser">
  <antcall target="build-service">
    <param name="service.file" value="src/com/ext/portlet/user/service.xml" />
  </antcall>
</target>

When you are ready, just double-click on the Ant target build-service-portlet-extUser. ServiceBuilder will build the related models and services for ExtUser and ExtUserPreference.

Building the portlet My Street
Similar to how we built the Ext Comment portlet, we can build the My Street portlet as follows:
Configure the My Street portlet in both the portlet-ext.xml and liferay-portlet-ext.xml files.
Set the title mapping in the Language-ext.properties file.
Add the My Street portlet to the Book category in the liferay-display.xml file.
Finally, specify Struts actions and forward paths in the struts-config.xml and tiles-defs.xml files, respectively.
Then, we need to create the Struts actions as follows:
Create a package com.ext.portlet.my_street.action in the folder /ext/ext-impl/src.
Add the Java files ViewAction.java, EditUserAction.java, and CreateAccountAction.java in this package.
Create a Java file AddUserLocalServiceUtil.java in this package and open it.
Add the following methods in AddUserLocalServiceUtil.java and save it:

public static ExtUser getUser(long userId) {
    ExtUser user = null;
    try {
        user = ExtUserLocalServiceUtil.getExtUser(userId);
    }
    catch (Exception e) {
    }
    if (user == null) {
        user = ExtUserLocalServiceUtil.createExtUser(userId);
        try {
            ExtUserLocalServiceUtil.updateExtUser(user);
        }
        catch (Exception e) {
        }
    }
    return user;
}

public static void deleteUser(long userId) {
    try {
        ExtUserLocalServiceUtil.deleteExtUser(userId);
    }
    catch (Exception e) {
    }
}

public static void updateUser(ActionRequest actionRequest, long userId) {
    /* ignore details */
}

public static List<ExtUserPreference> getUserPreferences(long userId, int limit) {
    /* ignore details */
}

public static ExtUserPreference addUserPreference(
    long userId, String favorite, String favoriteType) {
    /* ignore details */
}

As shown in the code above, these are methods to get an ExtUser and its ExtUserPreference entries, to delete an ExtUser, and to add and update ExtUser and ExtUserPreference.
In addition, we need to provide default values for the private page flag and the friendly URL in portal-ext.properties as follows:

ext.default_page.my_street.private_page=false
ext.default_page.my_street.friend_url=/my_street

The code above sets the default private page flag of my_street to false and the default friendly URL of my_street to /my_street. Therefore, you can use the VM service to generate a URL in order to add videos, games, and playlists into My Street.

Adding Struts view pages
Now we need to build the Struts view pages view.jsp, forget_password.jsp, create_account.jsp, congratulation.jsp, and congrates_uid.jsp. The following are the main steps to do so:
Create a folder my_street in /ext/ext-web/docroot/html/portlet/ext/.
Create the JSP pages view.jsp, view_password.jsp, userQuestions.jsp, edit_account.jsp, create_account.jsp, congratulation.jsp, and congrates_uid.jsp in /ext/ext-web/docroot/html/portlet/ext/my_street/.
Note that congrates_uid.jsp is used for the pop-up congratulation of Add to My Street. When you click on the Add to My Street button, a window with congrates_uid.jsp will pop up. userQuestions.jsp is used when the user has forgotten the password of My Street, view_password.jsp is for the general view of My Street, congratulation.jsp is used to present success information after creating the user account, and edit_account.jsp is used to create/update the user account.
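As a rough sketch of how the Add to My Street button could be wired up on the action side (the class and parameter names below are assumptions, not the actual Ext source), a Struts action only needs to read the favorite's UID and type from the request and store them through AddUserLocalServiceUtil:

import javax.portlet.ActionRequest;

import com.ext.portlet.my_street.action.AddUserLocalServiceUtil;

public class AddToMyStreetExample {

    // Hypothetical helper called from a processAction method once the
    // current user's userId is known
    public static void addFavorite(ActionRequest actionRequest, long userId) {
        String favorite = actionRequest.getParameter("favorite");         // e.g. a playlist UID
        String favoriteType = actionRequest.getParameter("favoriteType"); // game, video, or playlist
        AddUserLocalServiceUtil.addUserPreference(userId, favorite, favoriteType);
    }

}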

Oracle Wallet Manager

Packt
16 Oct 2009
9 min read
The Oracle Wallet Manager
Oracle Wallet Manager is a password-protected, stand-alone Java application used to maintain security credentials and store SSL-related information such as authentication and signing credentials, private keys, certificates, and trusted certificates. OWM uses the Public-Key Cryptography Standards (PKCS) #12 specification for the Wallet format and PKCS #10 for certificate requests. Oracle Wallet Manager stores X.509 v3 certificates and private keys in the industry-standard PKCS #12 format, and generates certificate requests according to the PKCS #10 specification. This makes the Oracle Wallet structure interoperable with supported third-party PKI applications, and provides Wallet portability across operating systems. Additionally, Oracle Wallet Manager Wallets can be enabled to store credentials on hardware security modules that use APIs compliant with the PKCS #11 specification.
OWM creates Wallets, generates certificate requests, accesses Public Key Infrastructure-based services, saves credentials into cryptographic hardware such as smart cards, uploads and downloads Wallets to and from LDAP directories, and imports Wallets in PKCS #12 format.
In a Windows environment, Oracle Wallet Manager can be accessed from the start menu. The following screenshot shows the Oracle Wallet Manager Properties:
In a Unix-like environment, OWM can be accessed directly from the command line with the owm shell script located at $ORACLE_HOME/bin/owm; it requires a graphical environment so that it can be launched.

Creating the Oracle Wallet
If this is the first time the Wallet has been opened, then a Wallet file does not yet exist. A Wallet is physically created in a specified directory. The user can declare the path where the Oracle Wallet file should be created, either accepting the default location or specifying a particular directory. A file named ewallet.p12 will be created in the specified location.

Enabling Auto Login
The Oracle Wallet Manager Auto Login feature creates an obfuscated copy of the Wallet and enables PKI-based access to the services without a password. When this feature is enabled, only the user who created the Wallet will have access to it. By default, Single Sign-On (SSO) access to a different database is disabled. The auto login feature must be enabled in order for you to have access to multiple databases using SSO. Checking and unchecking the Auto Login option enables and disables this feature.

mkwallet, the CLI OWM version
Besides the Java client, there is a command line interface version of the Wallet tool, which can be accessed by means of the mkwallet utility. It can also be used to generate a Wallet and have it configured in Auto Login mode. This is a fully featured tool that allows you to create Wallets, and to view and modify their content.
The options provided by the mkwallet tool are shown below:

-R rootPwd rootWrl DN keySize expDate
    Create the root Wallet
-e pwd wrl
    Create an empty Wallet
-r pwd wrl DN keySize certReqLoc
    Create a certificate request, add it to the Wallet, and export it to certReqLoc
-c rootPwd rootWrl certReqLoc certLoc
    Create a certificate for a certificate request
-i pwd wrl certLoc NZDST_CERTIFICATE | NZDST_CLEAR_PTP
    Install a certificate | trusted point
-d pwd wrl DN
    Delete a certificate with matching DN
-s pwd wrl
    Store sso Wallet
-p pwd wrl
    Dump the content of the Wallet
-q certLoc
    Dump the content of the certificate
-Lg pwd wrl crlLoc nextUpdate
    Generate CRL
-La pwd wrl crlLoc certtoRevoke
    Revoke certificate
-Ld crlLoc
    Display CRL
-Lv crlLoc cacert
    Verify CRL signature
-Ls crlLoc cert
    Check certificate revocation status
-Ll oidHostname oidPortNumber cacert
    Fetch CRL from LDAP directory
-Lc cert
    Fetch CRL from CRLDP in cert
-Lb b64CrlLoc derCrlLoc
    Convert CRL from B64 to DER format
-Pw pwd wrl pkcs11Lib tokenPassphrase
    Create an empty Wallet and store PKCS11 info in it
-Pq pwd wrl DN keysize certreqLoc
    Create a cert request and generate the key pair on the pkcs11 device
-Pl pwd wrl
    Test pkcs11 device login using a Wallet containing PKCS11 info
-Px pwd wrl pkcs11Lib tokenPassphrase
    Create a Wallet with pkcs11 info from a software Wallet

Managing Wallets with orapki
A CLI-based tool, orapki, is used to manage Public Key Infrastructure components such as Wallets and revocation lists. This tool eases the procedures related to PKI management and maintenance by allowing the user to include it in batch scripts. It can be used to create and view signed certificates for testing purposes, create Oracle Wallets, add and remove certificates and certificate requests, and manage Certificate Revocation Lists (CRLs), renaming them and managing them against the Oracle Internet Directory. The syntax for this tool is:

orapki module command -parameter <value>

module can have these values:
- wallet: Oracle Wallet
- crl: Certificate Revocation List
- cert: the PKI Certificate

To create a Wallet you can issue this command:

orapki wallet create -wallet <Path to Wallet>

To create a Wallet with the auto login feature enabled, you can issue the command:

orapki wallet create -wallet <Path to Wallet> -auto_login

To add a certificate request to the Wallet you can use the command:

orapki wallet add -wallet <wallet_location> -dn <user_dn> -keySize <512|1024|2048>

To add a user certificate to an Oracle Wallet:

orapki wallet add -wallet <wallet_location> -user_cert -cert <certificate_location>

The options and values available for the orapki tool depend on the module to be configured:

orapki cert create
    Creates a signed certificate for testing purposes.
    orapki cert create [-wallet <wallet_location>] -request <certificate_request_location> -cert <certificate_location> -validity <number_of_days> [-summary]
orapki cert display
    Displays details of a specific certificate.
    orapki cert display -cert <certificate_location> [-summary|-complete]
orapki crl delete
    Deletes CRLs from Oracle Internet Directory.
    orapki crl delete -issuer <issuer_name> -ldap <hostname:ssl_port> -user <username> [-wallet <wallet_location>] [-summary]
orapki crl display
    Displays specific CRLs that are stored in Oracle Internet Directory.
    orapki crl display -crl <crl_location> [-wallet <wallet_location>] [-summary|-complete]
orapki crl hash
    Generates a hash value of the certificate revocation list (CRL) issuer to identify the location of the CRL in your file system for certificate validation.
    orapki crl hash -crl <crl_filename|URL> [-wallet <wallet_location>] [-symlink|-copy] <crl_directory> [-summary]
orapki crl list
    Displays a list of CRLs stored in Oracle Internet Directory.
    orapki crl list -ldap <hostname:ssl_port>
orapki crl upload
    Uploads CRLs to the CRL subtree in Oracle Internet Directory.
    orapki crl upload -crl <crl_location> -ldap <hostname:ssl_port> -user <username> [-wallet <wallet_location>] [-summary]
orapki wallet add
    Adds certificate requests and certificates to an Oracle Wallet.
    orapki wallet add -wallet <wallet_location> -dn <user_dn> -keySize <512|1024|2048>
orapki wallet create
    Creates an Oracle Wallet or sets auto login on for an Oracle Wallet.
    orapki wallet create -wallet <wallet_location> [-auto_login]
orapki wallet display
    Displays the certificate requests, user certificates, and trusted certificates in an Oracle Wallet.
    orapki wallet display -wallet <wallet_location>
orapki wallet export
    Exports certificate requests and certificates from an Oracle Wallet.
    orapki wallet export -wallet <wallet_location> -dn <certificate_dn> -cert <certificate_filename>

Oracle Wallet Manager CSR generation
Oracle Wallet Manager generates a certificate request in PKCS #10 format. This certificate request can be sent to a certificate authority of your choice. The procedure to generate this certificate request is as follows:
From the main menu choose the Operations menu and then select the Add Certificate Request submenu. As shown in the following screenshot, a form will be displayed where you can capture specific information. The parameters used to request a certificate are described next:
- Common Name: This parameter is mandatory. It is the user's or entity's name. If you are using a user's name, enter it in the first name, last name format.
- Organization Unit: The name of the identity's organization unit, for example the department the entity belongs to (optional).
- Organization: The company's name (optional).
- Location/City: The location and the city where the entity resides (optional).
- State/Province: The full name of the state where the entity resides; do not use abbreviations (optional).
- Country: This parameter is mandatory. It specifies the country where the entity is located.
- Key Size: This parameter is mandatory. It defines the key size used when the public/private key pair is created; the key size can range from 512 up to 4096 bits.
- Advanced: When the parameters are entered, a Distinguished Name (DN) is assembled. If you want to customize this DN, you can use the advanced DN configuration mode.
Once the Certificate Request form has been completed, a PKCS #10 certificate request is generated. The information that appears between the BEGIN and END keywords must be used to request a certificate from a Certificate Authority (CA); there are several well-known certificate authorities, and depending on the usage you plan for your certificate, you could address the request to a CA the browser already trusts, so that when end users access your site they are not warned about the site's identity.
If the certificate is targeted at a local community that doesn't mind the certificate warning, then you may generate your own certificate or ask a CA to issue one for you. For demonstration purposes, we used the Oracle Certificate Authority (OCA) included with the Oracle Application Server. OCA provides Certificate Authority capabilities for your site and can issue standard certificates suitable for intranet users. If you are planning to use OCA, you should review the license agreements to determine whether you are allowed to use it.
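To tie the orapki commands above together, here is a hedged end-to-end sketch; the Wallet path, the DN, and the file names are assumptions, and exact flags can vary slightly between releases, so check orapki's built-in help on your system:

# 1. Create a Wallet with auto login enabled
orapki wallet create -wallet /u01/app/oracle/wallets/server -auto_login

# 2. Add a 2048-bit certificate request for the server's DN
orapki wallet add -wallet /u01/app/oracle/wallets/server \
  -dn "CN=dbserver.example.com,OU=IT,O=Example,C=US" -keySize 2048

# 3. Export the request so it can be sent to the CA
orapki wallet export -wallet /u01/app/oracle/wallets/server \
  -dn "CN=dbserver.example.com,OU=IT,O=Example,C=US" -request /tmp/server_req.csr

# 4. When the CA answers, import its trusted certificate first and then
#    the user certificate issued for the request
orapki wallet add -wallet /u01/app/oracle/wallets/server \
  -trusted_cert -cert /tmp/ca_cert.crt
orapki wallet add -wallet /u01/app/oracle/wallets/server \
  -user_cert -cert /tmp/server_cert.crt

# 5. Verify the Wallet's contents
orapki wallet display -wallet /u01/app/oracle/wallets/server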

Database/Data Model Round-Trip Engineering with MySQL

Packt
15 Oct 2009
3 min read
Power*Architect, from SQL Power, is a free software data modeling tool that you can download from its website www.sqlpower.ca and use under the GPLv3 license.

Reverse Engineering
To reverse engineer is to create the data model of an existing database. To reverse engineer an existing database in Power*Architect, we need to connect to the database. Figure 1 shows Power*Architect's connection window, where we define (create) our connection to the MySQL sales database that we'd like to reengineer.
Figure 1: Creating a database connection
By adding the conn_packt connection, the sales database objects are now available in Power*Architect.
Figure 2: Adding a database connection
By expanding the sales database, you can see all the objects that you need to create its data model.
Figure 3: Database objects
You create the ER diagram of the sales data model by dragging the sales object into the canvas (called the playpen in Power*Architect). Note that the objects in the model (those in the diagram) are now in the PlayPen Database.
Figure 4: Database objects in the PlayPen
Now that you have created the data model, you might want to save it.
Figure 5: Saving the data model (project)
Figure 6: Saving the sales.architect data model (project)
You have completed the sales database reverse engineering.

Updating the Data Model
Let's now add two new tables (hardware and software) and relate them to the product table. You add a table by clicking the New Table tool and dropping your cursor on the white space of the canvas.
Figure 7: New Table tool
Type in the name of the table, and then click OK.
Figure 8: Adding the hardware table
We now add a column to the hardware table by right-clicking the table and selecting New Column.
Figure 9: New Column menu selection
Type in the name of the column (model), select the VARCHAR data type (and its length), then click OK.
Figure 10: The model column
After adding the two tables and their columns, our ER diagram will look like Figure 11.
Figure 11: The hardware and software tables
Our last update is relating the hardware and software tables to the product table. Select the New Identifying Relationship tool, then click the product table and then the software table.
Figure 12: New Identifying Relationship tool
The software table is now related to the product table. Note that the product's primary key is migrated to the software table as a primary key.
Figure 13: software and product tables are related
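When this model is eventually forward-engineered to MySQL, identifying relationships like the ones above would produce DDL roughly along these lines; this is a hedged sketch, and the product_id and title column names and types are assumptions that depend on how the product table was actually defined:

-- The product primary key migrates into software and hardware and becomes
-- part of their primary keys (identifying relationships)
CREATE TABLE software (
  product_id INT NOT NULL,
  title      VARCHAR(50),
  PRIMARY KEY (product_id),
  CONSTRAINT fk_software_product
    FOREIGN KEY (product_id) REFERENCES product (product_id)
);

CREATE TABLE hardware (
  product_id INT NOT NULL,
  model      VARCHAR(50),
  PRIMARY KEY (product_id),
  CONSTRAINT fk_hardware_product
    FOREIGN KEY (product_id) REFERENCES product (product_id)
);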