
How-To Tutorials - Data

Creating Cartesian-based Graphs

Packt
17 Jan 2013
19 min read
Introduction

Our first graph/chart under the microscope is the most popular and the simplest one to create, and we can classify it roughly under Cartesian-based graphs. Although this graph style is relatively simple, it opens the door to amazingly creative ways of exploring data. In this article we will lay down the foundations for building charts in general and hopefully motivate you to come up with your own ideas on how to create engaging data visualizations.

Building a bar chart from scratch

The simplest chart around is the one that holds only one-dimensional data (only one value per type). There are many ways to showcase this type of data, but the most popular, logical, and simple way is by creating a bar chart. The steps involved in creating this bar chart will be very similar even in very complex charts. The ideal usage of this type of chart is when the main goal is to showcase simple data.

Getting ready

Create a basic HTML file that contains a canvas and an onLoad event that will trigger the init function, and load the 03.01.bar.js script. We will create the content of the JavaScript file in this recipe:

<!DOCTYPE html>
<html>
<head>
    <title>Bar Chart</title>
    <meta charset="utf-8" />
    <script src="03.01.bar.js"></script>
</head>
<body onLoad="init();" style="background:#fafafa">
    <h1>How many cats do they have?</h1>
    <canvas id="bar" width="550" height="400"></canvas>
</body>
</html>

Creating a graph in general has three steps: defining the work area, defining the data sources, and then drawing in the data.

How to do it...

In our first case, we will compare a group of friends and how many cats they each own. We will be performing the following steps:

Define your data set:

var data = [{label:"David", value:3, style:"rgba(241, 178, 225, 0.5)"},
            {label:"Ben", value:2, style:"#B1DDF3"},
            {label:"Oren", value:9, style:"#FFDE89"},
            {label:"Barbera", value:6, style:"#E3675C"},
            {label:"Belann", value:10, style:"#C2D985"}];

For this example I've created an array that can contain an unlimited number of elements. Each element contains three values: a label, a value, and a style for its fill color.

Define your graph outlines. Now that we have a data source, it's time to create our basic canvas information, which we create in each sample:

var can = document.getElementById("bar");
var wid = can.width;
var hei = can.height;
var context = can.getContext("2d");
context.fillStyle = "#eeeeee";
context.strokeStyle = "#999999";
context.fillRect(0,0,wid,hei);

The next step is to define our chart outlines:

var CHART_PADDING = 20;
context.font = "12pt Verdana, sans-serif";
context.fillStyle = "#999999";
context.moveTo(CHART_PADDING,CHART_PADDING);
context.lineTo(CHART_PADDING,hei-CHART_PADDING);
context.lineTo(wid-CHART_PADDING,hei-CHART_PADDING);
var stepSize = (hei - CHART_PADDING*2)/10;
for(var i=0; i<10; i++){
    context.moveTo(CHART_PADDING, CHART_PADDING + i*stepSize);
    context.lineTo(CHART_PADDING*1.3, CHART_PADDING + i*stepSize);
    context.fillText(10-i, CHART_PADDING*1.5, CHART_PADDING + i*stepSize + 6);
}
context.stroke();

Our next and final step is to create the actual data bars:

var elementWidth = (wid-CHART_PADDING*2)/data.length;
context.textAlign = "center";
for(i=0; i<data.length; i++){
    context.fillStyle = data[i].style;
    context.fillRect(CHART_PADDING + elementWidth*i, hei-CHART_PADDING - data[i].value*stepSize, elementWidth, data[i].value*stepSize);
    context.fillStyle = "rgba(255, 255, 225, 0.8)";
    context.fillText(data[i].label, CHART_PADDING + elementWidth*(i+.5), hei-CHART_PADDING*1.5);
}

That's it. Now, if you run the application in your browser, you will find a bar chart rendered.

How it works...

I've created a variable called CHART_PADDING that is used throughout the code to help me position elements (the variable is in uppercase to remind myself that it is a constant whose value will not change during the lifetime of the application). Let's delve deeper into the sample we created, starting from our outline area:

context.moveTo(CHART_PADDING,CHART_PADDING);
context.lineTo(CHART_PADDING,hei-CHART_PADDING);
context.lineTo(wid-CHART_PADDING,hei-CHART_PADDING);

In these lines we are creating the L-shaped frame for our data; this is just to provide a visual aid. The next step is to define the number of steps that we will use to represent the numeric data visually:

var stepSize = (hei - CHART_PADDING*2)/10;

In our sample we are hardcoding all of the data. So for the step size we take the total height of our chart (the height of our canvas minus the padding at the top and bottom) and divide it by the number of steps that will be used in the following for loop:

for(var i=0; i<10; i++){
    context.moveTo(CHART_PADDING, CHART_PADDING + i*stepSize);
    context.lineTo(CHART_PADDING*1.3, CHART_PADDING + i*stepSize);
    context.fillText(10-i, CHART_PADDING*1.5, CHART_PADDING + i*stepSize + 6);
}

We loop through 10 times, drawing a short tick line at each step and adding the numeric label with the fillText method. Notice that we are sending in the value 10-i. This value works well for us because we want the top value to be 10: we are starting at the top of the chart, and as the value of i increases we want the displayed value to get smaller as we move down the vertical line in each step of the loop.

Next we want to define the width of each bar. In our case, we want the bars to touch each other, so we take the total space available and divide it by the number of data elements:

var elementWidth = (wid-CHART_PADDING*2)/data.length;

At this stage we are ready to draw the bars. We loop through all the data we have and create them:

context.fillStyle = data[i].style;
context.fillRect(CHART_PADDING + elementWidth*i, hei-CHART_PADDING - data[i].value*stepSize, elementWidth, data[i].value*stepSize);
context.fillStyle = "rgba(255, 255, 225, 0.8)";

Notice that we are resetting the style twice each time the loop runs. If we didn't, we wouldn't get the colors we are hoping to get. We then place our text in the middle of the bar that was created:

context.textAlign = "center";
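The prose above describes how a single data value becomes a bar's position and height. As a quick illustration, here is a minimal sketch of that mapping pulled out into a standalone helper; the function name barRect and the worked numbers are only for this example and are not part of the recipe's code.

// Sketch: compute the rectangle that fillRect draws for one data element.
function barRect(index, value, elementWidth, hei, CHART_PADDING, stepSize) {
    // One unit of data corresponds to stepSize pixels of bar height.
    var barHeight = value * stepSize;
    return {
        x: CHART_PADDING + elementWidth * index,   // bars are laid out left to right
        y: hei - CHART_PADDING - barHeight,        // canvas y grows downwards, so the top sits above the baseline
        width: elementWidth,
        height: barHeight
    };
}

// With the recipe's numbers: hei = 400 and CHART_PADDING = 20, so
// stepSize = (400 - 40)/10 = 36 and elementWidth = (550 - 40)/5 = 102.
// David's value of 3 gives a 108px bar whose top sits at y = 400 - 20 - 108 = 272.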
There's more...

In our example, we created a non-flexible bar chart, and if this is the way we create charts we will need to recreate them from scratch each time. Let's revisit our code and tweak it to make it more reusable.

Revisiting the code

Although everything is working exactly as we want it to, it would stop working if we played around with the values. For example, what if I only wanted to have five steps? If we go back to our code, we will locate the following lines:

var stepSize = (hei - CHART_PADDING*2)/10;
for(var i=0; i<10; i++){

We can tweak them to handle five steps:

var stepSize = (hei - CHART_PADDING*2)/5;
for(var i=0; i<5; i++){

We would very quickly find out that our application is not working as expected. To solve this problem, let's create a new function that will deal with creating the outlines of the chart. Before we do that, let's extract the data object and create a new object that will contain the steps. Let's move the data out and format it in an accessible way:

var data = [...];
var chartYData = [{label:"10 cats", value:1},
                  {label:"5 cats", value:.5},
                  {label:"3 cats", value:.3}];
var range = {min:0, max:10};
var CHART_PADDING = 20;
var wid;
var hei;
function init(){

Take a deep look into the chartYData object, which enables us to put in as many steps as we want without a defined spacing rule, and the range object, which stores the minimum and maximum values of the overall graph. Before creating the new functions, let's add calls to them into our init function:

function init(){
    var can = document.getElementById("bar");
    wid = can.width;
    hei = can.height;
    var context = can.getContext("2d");
    context.fillStyle = "#eeeeee";
    context.strokeStyle = "#999999";
    context.fillRect(0,0,wid,hei);
    context.font = "12pt Verdana, sans-serif";
    context.fillStyle = "#999999";
    context.moveTo(CHART_PADDING,CHART_PADDING);
    context.lineTo(CHART_PADDING,hei-CHART_PADDING);
    context.lineTo(wid-CHART_PADDING,hei-CHART_PADDING);
    fillChart(context,chartYData);
    createBars(context,data);
}

All we did in this code is extract the creation of the chart and its bars into two separate functions. Now that we have an external data source for both the chart data and the content, we can build up their logic.

Using the fillChart function

The fillChart function's main goal is to create the foundation of the chart. We are integrating our new stepsData object information and building up the chart based on it:

function fillChart(context, stepsData){
    var steps = stepsData.length;
    var startY = CHART_PADDING;
    var endY = hei-CHART_PADDING;
    var chartHeight = endY-startY;
    var currentY;
    var rangeLength = range.max-range.min;
    for(var i=0; i<steps; i++){
        currentY = startY + (1-(stepsData[i].value/rangeLength)) * chartHeight;
        context.moveTo(CHART_PADDING, currentY);
        context.lineTo(CHART_PADDING*1.3, currentY);
        context.fillText(stepsData[i].label, CHART_PADDING*1.5, currentY+6);
    }
    context.stroke();
}

Our changes were not many, but with them we turned our function into something much more dynamic than it was before. This time around we are basing the positions on the stepsData objects and the range length derived from them.
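Because fillChart and createBars now receive the context and their data as arguments, the same drawing code can be reused for a different data set without modification. The following is a rough sketch of that idea; the second canvas id "bar2" and the dogData variable are made up for this example, and it assumes the second canvas has the same dimensions as the first, since wid, hei, range, and CHART_PADDING are still globals.

// Hypothetical second chart that reuses the two functions as they are.
var dogData = [{label:"David", value:1, style:"#B1DDF3"},
               {label:"Oren", value:4, style:"#FFDE89"}];

var can2 = document.getElementById("bar2");   // a second <canvas> of the same size
var context2 = can2.getContext("2d");
context2.font = "12pt Verdana, sans-serif";
context2.fillStyle = "#999999";

fillChart(context2, chartYData);   // same axis definition as before
createBars(context2, dogData);     // different data, same drawing logic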
Using the createBars function

Our next step is to revisit the createBars area and update it so the bars can be created dynamically using the external objects:

function createBars(context,data){
    var elementWidth = (wid-CHART_PADDING*2)/data.length;
    var startY = CHART_PADDING;
    var endY = hei-CHART_PADDING;
    var chartHeight = endY-startY;
    var rangeLength = range.max-range.min;
    var stepSize = chartHeight/rangeLength;
    context.textAlign = "center";
    for(i=0; i<data.length; i++){
        context.fillStyle = data[i].style;
        context.fillRect(CHART_PADDING + elementWidth*i, hei-CHART_PADDING - data[i].value*stepSize, elementWidth, data[i].value*stepSize);
        context.fillStyle = "rgba(255, 255, 225, 0.8)";
        context.fillText(data[i].label, CHART_PADDING + elementWidth*(i+.5), hei-CHART_PADDING*1.5);
    }
}

Almost nothing changed here apart from a few adjustments to the way we positioned the data and the extraction of hardcoded values.

Spreading data in a scatter chart

The scatter chart is a very powerful chart and is mainly used to get a bird's-eye view while comparing two data sets, for example, comparing the scores in an English class and the scores in a Math class to find a correlative relationship. This style of visual comparison can help find surprising relationships between unexpected data sets. It is ideal when the goal is to show a lot of detail in a very visual way.

Getting ready

If you haven't had a chance yet to scan through the logic of the first recipe in this article, I recommend you take a peek at it, as we are going to base a lot of our work on it while expanding it and making it a bit more complex to accommodate two data sets. I've revisited our data source from the previous section and modified it to store three variables of students' exam scores in Math, English, and Art:

var data = [{label:"David", math:50, english:80, art:92, style:"rgba(241, 178, 225, 0.5)"},
            {label:"Ben", math:80, english:60, art:43, style:"#B1DDF3"},
            {label:"Oren", math:70, english:20, art:92, style:"#FFDE89"},
            {label:"Barbera", math:90, english:55, art:81, style:"#E3675C"},
            {label:"Belann", math:50, english:50, art:50, style:"#C2D985"}];

Notice that this data is totally random, so we can't learn anything from the data itself; but we can learn a lot about how to get our chart ready for real data. We removed the value attribute and instead replaced it with math, english, and art attributes.

How to do it...

Let's dive right into the JavaScript file and the changes we want to make. Define the y space and x space. To do that, we will create a helper object that will store the required information:

var chartInfo = { y:{min:40, max:100, steps:5, label:"math"},
                  x:{min:40, max:100, steps:4, label:"english"} };

It's time for us to set up our other global variables and start up our init function:

var CHART_PADDING = 30;
var wid;
var hei;
function init(){
    var can = document.getElementById("bar");
    wid = can.width;
    hei = can.height;
    var context = can.getContext("2d");
    context.fillStyle = "#eeeeee";
    context.strokeStyle = "#999999";
    context.fillRect(0,0,wid,hei);
    context.font = "10pt Verdana, sans-serif";
    context.fillStyle = "#999999";
    context.moveTo(CHART_PADDING,CHART_PADDING);
    context.lineTo(CHART_PADDING,hei-CHART_PADDING);
    context.lineTo(wid-CHART_PADDING,hei-CHART_PADDING);
    fillChart(context,chartInfo);
    createDots(context,data);
}

Not much is new here; the major changes are the calls to fillChart and createDots at the end. Let's get on and start creating those two functions. If you worked through the previous section, you might notice that there are a lot of similarities between the functions there and these; I've deliberately changed the way we create things just to make them more interesting.
We are now dealing with two data points as well, so many details have changed. Let's review them:

function fillChart(context, chartInfo){
    var yData = chartInfo.y;
    var steps = yData.steps;
    var startY = CHART_PADDING;
    var endY = hei-CHART_PADDING;
    var chartHeight = endY-startY;
    var currentY;
    var rangeLength = yData.max-yData.min;
    var stepSize = rangeLength/steps;
    context.textAlign = "left";
    for(var i=0; i<steps; i++){
        currentY = startY + (i/steps) * chartHeight;
        context.moveTo(wid-CHART_PADDING, currentY);
        context.lineTo(CHART_PADDING, currentY);
        context.fillText(yData.min+stepSize*(steps-i), 0, currentY+4);
    }
    currentY = startY + chartHeight;
    context.moveTo(CHART_PADDING, currentY);
    context.lineTo(CHART_PADDING/2, currentY);
    context.fillText(yData.min, 0, currentY-3);

    var xData = chartInfo.x;
    steps = xData.steps;
    var startX = CHART_PADDING;
    var endX = wid-CHART_PADDING;
    var chartWidth = endX-startX;
    var currentX;
    rangeLength = xData.max-xData.min;
    stepSize = rangeLength/steps;
    context.textAlign = "left";
    for(var i=0; i<steps; i++){
        currentX = startX + (i/steps) * chartWidth;
        context.moveTo(currentX, startY);
        context.lineTo(currentX, endY);
        context.fillText(xData.min+stepSize*(i), currentX-6, endY+CHART_PADDING/2);
    }
    currentX = startX + chartWidth;
    context.moveTo(currentX, startY);
    context.lineTo(currentX, endY);
    context.fillText(xData.max, currentX-3, endY+CHART_PADDING/2);
    context.stroke();
}

When you review this code you will notice that our logic is almost duplicated twice. While in the first loop and first batch of variables we are figuring out the positions of each element in the y space, we move on in the second half of this function to calculate the layout for the x area. The y axis in canvas grows from top to bottom (top lower, bottom higher) and as such we need to calculate the height of the full graph and then subtract the value to find positions.

Our last function is to render the data points and to do that we create the createDots function:

function createDots(context,data){
    var yDataLabel = chartInfo.y.label;
    var xDataLabel = chartInfo.x.label;
    var yDataRange = chartInfo.y.max-chartInfo.y.min;
    var xDataRange = chartInfo.x.max-chartInfo.x.min;
    var chartHeight = hei - CHART_PADDING*2;
    var chartWidth = wid - CHART_PADDING*2;
    var yPos;
    var xPos;
    for(var i=0; i<data.length; i++){
        xPos = CHART_PADDING + (data[i][xDataLabel]-chartInfo.x.min)/xDataRange * chartWidth;
        yPos = (hei - CHART_PADDING) - (data[i][yDataLabel]-chartInfo.y.min)/yDataRange * chartHeight;
        context.fillStyle = data[i].style;
        context.fillRect(xPos-4, yPos-4, 8, 8);
    }
}

Here we are figuring out the same details for each point—both the y position and the x position—and then we draw a rectangle. Let's test our application now!

How it works...

We start by creating a new chartInfo object:

var chartInfo = { y:{min:40, max:100, steps:5, label:"math"},
                  x:{min:40, max:100, steps:4, label:"english"} };

This very simple object encapsulates the rules that will define what our chart will actually output. Looking closely you will see that we set an object named chartInfo that has information on the y and x axes. We have a minimum value (min property), a maximum value (max property), the number of steps we want to have in our chart (steps property), and we define a label.

Let's look deeper into the way the fillChart function works. In essence we have two numeric values; one is the actual space on the screen and the other is the value the space represents.
To match these values we need to know what our data range is and also what our view range is, so we start by finding our startY and endY points, followed by calculating the number of pixels between them:

var startY = CHART_PADDING;
var endY = hei-CHART_PADDING;
var chartHeight = endY-startY;

These values will be used when we try to figure out where to place the data from the chartInfo object. As we are already speaking about that object, let's look at what we do with it:

var yData = chartInfo.y;
var steps = yData.steps;
var rangeLength = yData.max-yData.min;
var stepSize = rangeLength/steps;

As our focus right now is on the height, we are looking deeper into the y property and, for the sake of comfort, we call it yData. Now that we are focused on this object, it's time to figure out the actual data range (rangeLength) of this value, which will be our conversion factor. In other words, we want to take the visual space between the points startY and endY and, based on the range, position values in this space. When we do so we can convert any data value into a number between 0 and 1 and then position it in a dynamic visible area. Last but not least, as our new data object contains the number of steps we want in the chart, we use that to define the step value. In this example it would be 12: we take our rangeLength (100 - 40 = 60) and divide it by the number of steps (in our case, 5). Now that we have the critical variables out of the way, it's time to loop through the data and draw our chart:

var currentY;
context.textAlign = "left";
for(var i=0; i<steps; i++){
    currentY = startY + (i/steps) * chartHeight;
    context.moveTo(wid-CHART_PADDING, currentY);
    context.lineTo(CHART_PADDING, currentY);
    context.fillText(yData.min+stepSize*(steps-i), 0, currentY+4);
}

This is where the magic comes to life. We run through the number of steps and calculate the new Y position each time. If we break it down we will see:

currentY = startY + (i/steps) * chartHeight;

We start from the start position of our chart (the upper area) and then add the steps by taking the current i position and dividing it by the total possible steps (0/5, 1/5, 2/5, and so on). In our demo it's 5, but it can be any value and should be inserted into the chartInfo steps attribute. We multiply the returned value by the height of our chart calculated earlier. To compensate for the fact that we started from the top, we need to reverse the actual text we put into the text field:

yData.min+stepSize*(steps-i)

This code takes our earlier variables and puts them to work. We start with the minimum possible value and then add to it stepSize times the total number of steps minus the number of the current step.

Let's dig into the createDots function and see how it works. We start with our setup variables:

var yDataLabel = chartInfo.y.label;
var xDataLabel = chartInfo.x.label;

This is one of my favorite parts of this section. We are grabbing the label from our chartInfo object and using it as our ID; this ID will be used to grab information from our data object. If you wish to change the values, all you need to do is switch the labels in the chartInfo object. Again it's time for us to figure out our ranges, as we've done earlier in the fillChart function.
This time around we want to get the actual ranges for both the x and y axes and the actual width and height of the area we have to work with:

var yDataRange = chartInfo.y.max-chartInfo.y.min;
var xDataRange = chartInfo.x.max-chartInfo.x.min;
var chartHeight = hei - CHART_PADDING*2;
var chartWidth = wid - CHART_PADDING*2;

We also need a couple of variables to keep track of our current x and y positions within the loop:

var yPos;
var xPos;

Let's go deeper into our loop, mainly into the position calculations:

for(var i=0; i<data.length; i++){
    xPos = CHART_PADDING + (data[i][xDataLabel]-chartInfo.x.min)/xDataRange * chartWidth;
    yPos = (hei - CHART_PADDING) - (data[i][yDataLabel]-chartInfo.y.min)/yDataRange * chartHeight;
    context.fillStyle = data[i].style;
    context.fillRect(xPos-4, yPos-4, 8, 8);
}

The heart of everything here is discovering where our elements need to be. The logic is almost identical for both the xPos and yPos variables, with a few variations. The first thing we need to do to calculate the xPos variable is:

(data[i][xDataLabel]-chartInfo.x.min)

In this part we use the label, xDataLabel, that we created earlier to get the current student's score in that subject. We then subtract from it the lowest possible score. As our chart doesn't start from 0, we don't want the values between 0 and our minimum value to affect the position on the screen. For example, let's say we are focused on math and our student has a score of 80; we subtract 40 from that (80 - 40 = 40) and then apply the following formula:

(data[i][xDataLabel] - chartInfo.x.min) / xDataRange

We divide that value by our data range (100 - 40 = 60), so in our example we get 40/60. The returned result will always be between 0 and 1. We can use the returned number and multiply it by the actual space in pixels to know exactly where to position our element on the screen. We do so by multiplying the value we got (between 0 and 1) by the total available space (in this case, the width). Once we know where it needs to be located, we add the starting point of our chart (the padding):

xPos = CHART_PADDING + (data[i][xDataLabel]-chartInfo.x.min)/xDataRange * chartWidth;

The yPos variable follows the same logic as the xPos variable, but here we focus only on the height.
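The same normalize-then-scale idea can be written as a pair of tiny helpers, which makes the symmetry between the two axes explicit. This is only an illustrative sketch; the function names below are assumptions for this example and are not part of the recipe's code.

// Sketch: map a raw score onto the chart area.
// axis is chartInfo.x or chartInfo.y.
function normalize(value, axis) {
    return (value - axis.min) / (axis.max - axis.min);   // always between 0 and 1
}

function toXPos(value, axis, chartWidth, CHART_PADDING) {
    return CHART_PADDING + normalize(value, axis) * chartWidth;
}

function toYPos(value, axis, chartHeight, hei, CHART_PADDING) {
    // Subtract from the baseline because canvas y grows downwards.
    return (hei - CHART_PADDING) - normalize(value, axis) * chartHeight;
}

// With the recipe's chartInfo (min 40, max 100) and a math score of 80:
// normalize(80, chartInfo.y) = (80 - 40)/60, roughly 0.67, so the dot sits
// about two thirds of the way up the chart area.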

SAP HANA integration with Microsoft Excel

Packt
03 Jan 2013
4 min read
Once your application is finished inside SAP HANA and you can see that it performs as expected inside the Studio, you need to be able to deploy it to your users. Asking them to use the Studio is not really practical, and you don't necessarily want to put the modeling software in the hands of all your users.

Reporting on SAP HANA can be done in most of SAP's BusinessObjects suite of applications, or in tools which can create and consume MDX queries and data. The simplest of these tools to start with is probably Microsoft Excel. Excel can connect to SAP HANA using the MDX language (a kind of multidimensional SQL) in the form of pivot tables. These in turn allow users to "slice and dice" data as they require, to extract the metrics they need.

There are (at the time of writing) limitations to the integration of SAP HANA with external reporting tools. These limitations are due to the relative youth of the HANA product, and are being addressed with each successive update to the software. Those listed here are valid for SAP HANA SP04; they may or may not be valid for your version:

• Hierarchies can only be visualized in Microsoft Excel, not in BusinessObjects.
• Prompts can only be used in BusinessObjects BI4.
• Views which use variables can be used in other tools, but only if the variable has a default value (if you don't have a default value on the variable, then Excel, notably, will complain that the view has been "changed on the server").

In order to make MDX connections to SAP HANA, the SAP HANA Client software is needed. This is separate from the Studio, and must be installed on the client workstation. Like the Studio itself, it can be found on the SAP HANA DVD set or in the SWDC. Additionally, like the Studio, SAP provides a developer download of the client software on SDN, at the following link:

http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/webcontent/uuid/402aa158-6a7a-2f10-0195-f43595f6fe5f

Just download the appropriate version for your Microsoft Office installation. Even if your PC has a 64-bit installation of Windows, you most likely have a 32-bit installation of Office, and you'll need the 32-bit version of the SAP HANA Client software. If you're not sure, you can find the information in the Help | About dialog box. In Excel 2010, for example, click on the File tab, then the Help menu entry; the version is specified on the right of the page.

Just install the client software like you installed the Studio, usually to the default location. Once the software is installed, there is no shortcut created on your desktop and no entry in your Start menu, so don't be surprised not to see anything to run.

We're going to incorporate our sales simulator in Microsoft Excel, so launch Excel now. Go to the Data tab, and click on From Other Sources, then From Data Connection Wizard. Next, select Other/Advanced, then SAP HANA MDX provider, and click Next. The SAP HANA Logon dialog will appear, so enter your Host, Instance, and login information (the same information you use to connect to SAP HANA with the Studio). Click on Test Connection to validate the connection. If the test succeeds, click on OK to choose the cube to which you want to connect. In Excel, all your Analytic and Calculation Views are considered to be cubes. Choose your Analytic or Calculation View and click Next.
On the next screen there is a Save password in file checkbox; selecting it avoids having to type in the SAP HANA password every time the Excel file is opened, but the password is then stored in the Excel file itself, which is a little less secure. Click on the Finish button to create the connection to SAP HANA and your View. You will then be asked where you want to insert the pivot table; just click on OK to see the results.

Congratulations! You now have your reporting application available in Microsoft Excel, showing the same information you could see using the Data Preview feature of the SAP HANA Studio.

Creating Interactive Graphics and Animation

Packt
02 Jan 2013
15 min read
Interactive graphics and animations

This article showcases MATLAB's capabilities for creating interactive graphics and animations. A static graphic is essentially two dimensional. The ability to rotate the axes and change the view, add annotations in real time, delete data, and zoom in or zoom out adds significantly to the user experience, as the brain is able to process and see more from that interaction. MATLAB supports interactivity with the standard zoom and pan features, a powerful set of camera tools to change the data view, data brushing, and axes linking, all accessible from the figure and camera toolbars.

The steps of interactive exploration can also be recorded and presented as an animation. This is very useful to demonstrate the evolution of the data in time or space, or along any dimension where sequence has meaning.

Note that some recipes in this article may require you to run the code from the source code files as a whole unit because they were developed as functions. As functions, they are not independently interpretable using the separate code blocks corresponding to each step.

Callback functions

A mouse drag from the top-left corner to the bottom-right corner is commonly used for zooming in or selecting a group of objects. You can also program a custom behavior for such an interaction event by using a callback function. When a specific event occurs (for example, you click on a push button or double-click with your mouse), the corresponding callback function executes. Many event properties of graphics handle objects can be used to define callback functions. In this recipe, you will write callback functions, which are essential to implement elements such as a slider that gets input from the user on where to create a slice or an isosurface for 3D exploration. You will also see the options available to share data between the calling and callback functions.

Getting started

Load the dataset. Split the data into two main sets: userdataA is a structure with variables related to the demographics and userdataB is a structure with variables related to the Income Groups. Now create a nested structure with these two data structures as shown in the following code snippet:

load customCountyData
userdataA.demgraphics = demgraphics;
userdataA.lege = lege;
userdataB.incomeGroups = incomeGroups;
userdataB.crimeRateLI = crimeRateLI;
userdataB.crimeRateHI = crimeRateHI;
userdataB.crimeRateMI = crimeRateMI;
userdataB.AverageSATScoresLI = AverageSATScoresLI;
userdataB.AverageSATScoresMI = AverageSATScoresMI;
userdataB.AverageSATScoresHI = AverageSATScoresHI;
userdataB.icleg = icleg;
userdataAB.years = years;
userdataAB.userdataA = userdataA;
userdataAB.userdataB = userdataB;

How to do it...

Perform the following steps:

Run this as a function at the console:

c3165_07_01_callback_functions

A figure is brought up with a non-standard Data Groups menu item. Select the By Population item to see the resultant figure, and continue to explore the other options to fully exercise the interactivity built into this graphic.

How it works...

The function c3165_07_01_callback_functions works as follows. A custom menu item Data Groups is created, with the submenu items By Population, By IncomeGroups, and ShowAll:

% add main menu item
f = uimenu('Label','Data Groups');
% add sub menu items with additional parameters
uimenu(f,'Label','By Population','Callback','showData',...
    'tag','demographics','userdata',userdataAB);
uimenu(f,'Label','By IncomeGroups',...
    'Callback','showData','tag','IncomeGroups',...
    'userdata',userdataAB);
uimenu(f,'Label','ShowAll','Callback','showData',...
    'tag','together','userdata',userdataAB);

You defined the tag name and the callback function for each submenu item above. Having a tag name makes it easier to use the same callback function with multiple objects, because you can query the tag name to find out which object initiated the call to the callback function (if you need that information). In this example, the callback function behavior depends upon which submenu item was selected, so the tag property allowed you to use the single function showData as the callback for all three submenu items and still implement submenu-item-specific behavior. Alternately, you could register three different callback functions and use no tag names.

You can specify the value of a callback property in three ways. Here, you gave it a function handle. Alternately, you can supply a string that is a MATLAB command that executes when the callback is invoked, or a cell array with the function handle and additional arguments, as you will see in the next section.

For passing data between the calling and callback functions, you also have three options. Here, you set the userdata property to the variable name that has the data needed by the callback function. Note that userdata is just one variable, and you passed a complicated data structure as userdata to effectively pass multiple values. The user data can be extracted from within the callback function of the object or menu item whose callback is executing, as follows:

userdata = get(gcbo,'userdata');

The second alternative to pass data to callback functions is by means of the application data. This does not require you to build a complicated data structure. Depending on how much data you need to pass, this latter option may be the faster mechanism. It also has the advantage that the userdata space cannot inadvertently get overwritten by some other function. Use the setappdata function to pass multiple variables. In this recipe, you maintained the main drawing area axis handles and the custom legend axis handles as application data:

setappdata(gcf,'mainAxes',[]);
setappdata(gcf,'labelAxes',[]);

This was retrieved each time within the executing callback functions, to clear the graphic as new choices are selected by the user from the custom menu:

mainAxesHandle = getappdata(gcf,'mainAxes');
labelAxesHandles = getappdata(gcf,'labelAxes');
if ~isempty(mainAxesHandle),
    cla(mainAxesHandle);
    [mainAxesHandle, x, y, ci, cd] = ...
        redrawGrid(userdata.years, mainAxesHandle);
else
    [mainAxesHandle, x, y, ci, cd] = ...
        redrawGrid(userdata.years);
end
if ~isempty(labelAxesHandles)
    for ij = 1:length(labelAxesHandles)
        cla(labelAxesHandles(ij));
    end
end

The third option to pass data to callback functions is at the time of defining the callback property, where you can supply a cell array with the function handle and additional arguments, as you will see in the next section. These are local copies of the data passed to the function and will not affect the global values of the variables.

The callback function showData is given below. Functions that you want to use as function handle callbacks must define at least two input arguments: the handle of the object generating the callback (the source of the event) and the event data structure (which can be empty for some callbacks).
function showData(src, evt)
    userdata = get(gcbo,'userdata');
    if strcmp(get(gcbo,'tag'),'demographics')
        % Call grid drawing code block
        % Call showDemographics with relevant inputs
    elseif strcmp(get(gcbo,'tag'),'IncomeGroups')
        % Call grid drawing code block
        % Call showIncomeGroups with relevant inputs
    else
        % Call grid drawing code block
        % Call showDemographics with relevant inputs
        % Call showIncomeGroups with relevant inputs
    end

    function labelAxesHandle = ...
            showDemographics(userdata, mainAxesHandle, x, y, cd)
        % Function specific code
    end

    function labelAxesHandle = ...
            showIncomeGroups(userdata, mainAxesHandle, x, y, ci)
        % Function specific code
    end

    function [mainAxesHandle x y ci cd] = ...
            redrawGrid(years, mainAxesHandle)
        % Grid drawing function specific code
    end
end

There's more...

This section demonstrates the third option for passing data to callback functions: supplying a cell array with the function handle and additional arguments at the time of defining the callback property.

Add a fourth submenu item as follows (uncomment line 45 of the source code):

uimenu(f,'Label',...
    'Alternative way to pass data to callback',...
    'Callback',{@showData1,userdataAB},'tag','blah');

Define the showData1 function as follows (uncomment lines 49 to 51 of the source code):

function showData1(src, evt, arg1)
    disp(arg1.years);
end

Execute the function and see that the values of the years variable are displayed at the MATLAB console when you select the last submenu item, Alternative way to pass data to callback.

Takeaways from this recipe:

• Use callback functions to define custom responses for each user interaction with your graphic
• Use one of the three options for sharing data between calling and callback functions (pass data as arguments with the callback definition, via the user data space, or via the application data space), as appropriate

See also

Look up MATLAB help on the setappdata and getappdata commands, the userdata and callback properties, and the uimenu command.

Obtaining user input from the graph

User input may be desired for annotating data, in terms of adding a label to one or more data points, or for allowing user-settable boundary definitions on the graphic. This recipe illustrates how to use MATLAB to support these needs.

Getting started

The recipe shows a two-dimensional dataset of intensity values obtained from two different dye fluorescence readings. There are some clearly identifiable clusters of points in this 2D space. The user is allowed to draw boundaries to group points and identify these clusters. Load the data:

load clusterInteractivData

The imellipse function from the MATLAB Image Processing Toolbox is used in this recipe. Trial downloads are available from the MathWorks website.

How to do it...

The function constitutes the following steps:

Set up the user data variables to share the data between the callback functions of the push button elements in this graph:

userdata.symbChoice = {'+','x','o','s','^'};
userdata.boundDef = [];
userdata.X = X;
userdata.Y = Y;
userdata.Calls = ones(size(X));
set(gcf,'userdata',userdata);

Make the initial plot of the data:

plot(userdata.X,userdata.Y,'k.','Markersize',18);
hold on;

Add the push button elements to the graphic:

uicontrol('style','pushbutton',...
    'string','Add cluster boundaries?', ...
    'Callback',@addBound, ...
    'Position', [10 21 250 20],'fontsize',12);
uicontrol('style','pushbutton', ...
    'string','Classify', ...
    'Callback',@classifyPts, ...
    'Position', [270 21 100 20],'fontsize',12);
uicontrol('style','pushbutton', ...
    'string','Clear Boundaries', ...
    'Callback',@clearBounds, ...
    'Position', [380 21 150 20],'fontsize',12);

Define a callback for each of the pushbutton elements. The addBound function is for defining the cluster boundaries. The steps are as follows:

% Retrieve the userdata data
userdata = get(gcf,'userdata');
% Allow a maximum of four cluster boundary definitions
if length(userdata.boundDef)>4
    msgbox('A maximum of four clusters allowed!');
    return;
end
% Allow user to define a bounding curve
h=imellipse(gca);
% The boundary definition is added to a cell array with
% each element of the array storing the boundary def.
userdata.boundDef{length(userdata.boundDef)+1} = ...
    h.getPosition;
set(gcf,'userdata',userdata);

The classifyPts function draws the points enclosed in a given boundary with a unique symbol per boundary definition. The logic used in this classification function is simple and will run into difficulties with complex boundary definitions; however, that is ignored, as it is not the focus of this recipe. First, find the points whose coordinates lie in the range defined by the coordinates of the boundary definition. Then, assign a unique symbol to all points within that boundary:

for i = 1:length(userdata.boundDef)
    pts = ...
        find( (userdata.X>(userdata.boundDef{i}(:,1)))& ...
              (userdata.X<(userdata.boundDef{i}(:,1)+ ...
                  userdata.boundDef{i}(:,3))) &...
              (userdata.Y>(userdata.boundDef{i}(:,2)))& ...
              (userdata.Y<(userdata.boundDef{i}(:,2)+ ...
                  userdata.boundDef{i}(:,4))));
    userdata.Calls(pts) = i;
    plot(userdata.X(pts),userdata.Y(pts), ...
        userdata.symbChoice{i}, ...
        'Markersize',18);
    hold on;
end

The clearBounds function clears the drawn boundaries and removes the clustering based upon those boundary definitions:

function clearBounds(src, evt)
    cla;
    userdata = get(gcf,'userdata');
    userdata.boundDef = [];
    set(gcf,'userdata',userdata);
    plot(userdata.X,userdata.Y,'k.','Markersize',18);
    hold on;
end

Run the code and define cluster boundaries using the mouse. Note that classification does not occur until you click on the Classify button. Initiate a classification by clicking on Classify; the graph will respond by re-drawing all points inside the constructed boundary with a specific symbol.

How it works...

This recipe illustrates how user input is obtained from the graphical display in order to affect the results produced. The Image Processing Toolbox has several such functions that allow the user to provide input by mouse clicks on the graphical display, such as imellipse for drawing elliptical boundaries and imrect for drawing rectangular boundaries. You can refer to the product pages for more information.

Takeaways from this recipe:

• Obtain user input directly via the graph, in terms of data-point-level annotations and/or user-settable boundary definitions

See also

Look up MATLAB help on the imline, impoly, imfreehand, imrect, imellipse, and ginput commands.

Linked axes and data brushing

MATLAB allows the creation of programmatic links between a plot and its data sources, and the linking of different plots together. This feature is augmented by support for data brushing, which is a way to select data and mark it up to distinguish it from the rest. Linking plots to their data source allows you to manipulate the values in the variables and have the plot automatically update to reflect the changes. Linking between axes enables actions such as zoom or pan to simultaneously affect the view in all linked axes.
Data brushing allows you to directly manipulate the data on the plot and have the linked views reflect the effect of that manipulation and/or selection. These features can provide a live and synchronized view of different aspects of your data.

Getting ready

You will use the same cluster data as in the previous recipe. Each point is denoted by an x and y value pair. The angle of each point can be computed as the inverse tangent of the ratio of the y value to the x value, and the amplitude of each point as the square root of the sum of squares of the x and y values. The main panel in row 1 shows the data in a scatter plot. The two plots in the second row show the angle and amplitude values of each point respectively. The fourth and fifth panels, in the third row, are histograms of the x and y values respectively. Load the data and calculate the angle and amplitude data as described earlier:

load clusterInteractivData
data(:,1) = X;
data(:,2) = Y;
data(:,3) = atan(Y./X);
data(:,4) = sqrt(X.^2 + Y.^2);
clear X Y

How to do it...

Perform the following steps:

Plot the raw data:

axes('position',[.3196 .6191 .3537 .3211], ...
    'Fontsize',12);
scatter(data(:,1), data(:,2),'ks', ...
    'XDataSource','data(:,1)','YDataSource','data(:,2)');
box on;
xlabel('Dye 1 Intensity'); ylabel('Dye 2 Intensity');
title('Cluster Plot');

Plot the angle data:

axes('position',[.0682 .3009 .4051 .2240], ...
    'Fontsize',12);
scatter(1:length(data),data(:,3),'ks',...
    'YDataSource','data(:,3)');
box on;
xlabel('Serial Number of Points');
title('Angle made by each point to the x axis');
ylabel('tan^{-1}(Y/X)');

Plot the amplitude data:

axes('position',[.5588 .3009 .4051 .2240], ...
    'Fontsize',12);
scatter(1:length(data),data(:,4),'ks', ...
    'YDataSource','data(:,4)');
box on;
xlabel('Serial Number of Points');
title('Amplitude of each point');
ylabel('\surd(X^2 + Y^2)');

Plot the two histograms:

axes('position',[.0682 .0407 .4051 .1730], ...
    'Fontsize',12);
hist(data(:,1));
title('Histogram of Dye 1 Intensities');
axes('position',[.5588 .0407 .4051 .1730], ...
    'Fontsize',12);
hist(data(:,2));
title('Histogram of Dye 2 Intensities');

Programmatically link the data to their source:

linkdata;

Programmatically turn brushing on and set the brush color to green:

h = brush;
set(h,'Color',[0 1 0],'Enable','on');

Use mouse movements to brush a set of points. You can do this on any one of the first three panels and observe the corresponding points in the other graphs turning green.

How it works...

Because brushing is turned on, when you focus the mouse on any of the graph areas, a crosshair shows up at the cursor. You can drag to select an area of the graph. Points falling within the selected area are brushed to the color green for the graphs on rows 1 and 2. Note that nothing is highlighted on the histograms at this point. This is because the x and y data sources for the histograms are not correctly linked to the data source variables yet. For the other graphs, you programmatically set their x and y data sources via the XDataSource and YDataSource properties. You can also define the source data variables to link to a graphic and turn brushing on by using the brush and link data buttons on the figure toolbar.
You can click on the Edit link to exactly define the x and y sources.

There's more...

To define the source data variables to link to a graphic and turn brushing on using the icons from the figure toolbar, do as follows. Clicking on Edit brings up the data source window. Enter data(:,1) in the YDataSource column for row 1 and data(:,2) in the YDataSource column for row 2. Now try brushing again and observe that the bins of the histograms get highlighted in a bottom-up order as the corresponding points get selected.

Link axes together to simultaneously investigate multiple aspects of the same data point. For example, in this step you plot the cluster data alongside a random quality value for each point of the data. Link the axes such that the zoom and pan functions on either one will affect the other linked axes:

axes('position',[.13 .11 .34 .71]);
scatter(data(:,1), data(:,2),'ks');
box on;
axes('position',[.57 .11 .34 .71]);
scatter(data(:,1), data(:,2),[],rand(size(data,1),1), ...
    'marker','o', 'LineWidth',2);
box on;
linkaxes;

Experiment with the zoom and pan functionalities on this graph.

Takeaways from this recipe:

• Use the data brushing and linked axes features to provide a live and synchronized view of different aspects of your data.

See also

Look up MATLAB help on the linkdata, linkaxes, and brush commands.

Meet QlikView

Packt
13 Dec 2012
15 min read
What is QlikView?

QlikView is developed by QlikTech, a company that was founded in Sweden in 1993, but has since moved its headquarters to the US. QlikView is a tool used for Business Intelligence, often shortened to BI. Business Intelligence is defined by Gartner, a leading industry analyst firm, as:

An umbrella term that includes the application, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance.

Following this definition, QlikView is a tool that enables access to information in order to analyze it, which in turn improves and optimizes business decisions and performance.

Historically, BI has been very much IT-driven. IT departments were responsible for the entire Business Intelligence life cycle, from extracting the data to delivering the final reports, analyses, and dashboards. While this model works very well for delivering predefined static reports, most businesses find that it does not meet the needs of their business users. As IT tightly controls the data and tools, users often experience long lead times whenever new questions arise that cannot be answered with the standard reports.

How does QlikView differ from traditional BI?

QlikTech prides itself on taking an approach to Business Intelligence that is different from what companies such as Oracle, SAP, and IBM (described by QlikTech as traditional BI vendors) are delivering. It aims to put the tools in the hands of business users, allowing them to become self-sufficient because they can perform their own analyses.

Independent industry analyst firms have noticed this different approach as well. In 2011, Gartner created a subcategory for Data Discovery tools in its yearly market evaluation, the Magic Quadrant for Business Intelligence platforms, and QlikView was named the poster child for this new category of BI tools. QlikTech chooses to describe itself as a Business Discovery enterprise instead of a Data Discovery enterprise; it believes that discovering business insights is much more important than discovering data. Besides the difference in who uses the tool (IT users versus business users), there are a few other key features that differentiate QlikView from other solutions.

Associative user experience

The main difference between QlikView and other BI solutions is the associative user experience. Where traditional BI solutions use predefined paths to navigate and explore data, QlikView allows users to take whatever route they want. This is a far more intuitive way to explore data, which QlikTech describes as "working the way your mind works." While in a typical BI solution we would need to start by selecting a Region and then drill down step-by-step through the defined drill path, in QlikView we can choose whatever entry point we like: Region, State, Product, or Sales Person. We are then shown only the data related to that selection, and in our next selection we can go wherever we want. It is infinitely flexible.

Additionally, the QlikView user interface allows us to see which data is associated with our selection. For example, the screenshot from QlikTech's What's New in QlikView 11 demo document shows a QlikView dashboard in which two values are selected: in the Quarter field, Q3 is selected, and in the Sales Reps field, Cart Lynch is selected.
We can see this because these values are green, which in QlikView means that they have been selected. When a selection is made, the interface automatically updates to show not only which data is associated with that selection, but also which data is not associated with it. Associated data has a white background, while non-associated data has a gray background. Sometimes the associations are pretty obvious; it is no surprise that the third quarter is associated with the months July, August, and September. At other times, however, some not-so-obvious insights surface, such as the information that Cart Lynch has not sold any products in Germany or Spain. This extra information, not featured in traditional BI tools, can be of great value, as it offers a new starting point for investigation.

Technology

QlikView's core technological differentiator is that it uses an in-memory data model, which stores all of its data in RAM instead of on disk. As RAM is much faster than disk, this allows for very fast response times, resulting in a very smooth user experience.

Adoption path

There is also a difference between QlikView and traditional BI solutions in the way it is typically rolled out within a company. Where traditional BI suites are often implemented top-down, with IT selecting a BI tool for the entire company, QlikView often takes a bottom-up adoption path: business users in a single department adopt it, and its use spreads out from there.

QlikView is free of charge for single-user use. This is called the Personal Edition, or PE. Documents created in Personal Edition can be opened by fully licensed users or deployed on a QlikView server. The limitation is that, with the exception of some documents enabled for PE by QlikTech, you cannot open documents created elsewhere, or even your own documents if they have been opened and saved by another user or server instance.

Often, a business user will decide to download QlikView to see if he can solve a business problem. When other users within the department see the software, they get enthusiastic about it, so they too download a copy. To be able to share documents, they decide to purchase a few licenses for the department. Then other departments start to take notice too, and QlikView gains traction within the organization. Before long, IT and senior management also take notice, eventually leading to enterprise-wide adoption of QlikView. QlikView facilitates every step in this process, scaling from single-laptop deployments to full enterprise-wide deployments with thousands of users.

As the popularity and track record of QlikView have grown, it has gained more and more visibility at the enterprise level. While the adoption path described above is still probably the most common one, it is not uncommon nowadays for a company to do a top-down, company-wide rollout of QlikView.

Exploring data with QlikView

Now that we know what QlikView is and how it differs from traditional BI offerings, we will learn how we can explore data within QlikView.

Getting QlikView

Of course, before we can start exploring, we need to install QlikView. You can download QlikView's Personal Edition from http://www.qlikview.com/download. You will be asked to register on the website, or log in if you have registered before.
Registering not only gives you access to the QlikView software; you can also use it to read and post on QlikCommunity (http://community.qlikview.com), which is QlikTech's user forum. This forum is very active, and many questions can be answered by either a quick search or by posting a question.

Installing QlikView is very straightforward: simply double-click on the executable file and accept all the default options offered. After you are done installing it, launch the QlikView application. QlikView will open with the start page set to the Getting Started tab.

The example we will be using is the Movie Database, which is an example document that is supplied with QlikView. Find this document by scrolling down the Examples list (it is around halfway down the list) and click to open it. The opening screen of the document will now be displayed.

Navigating the document

Most QlikView documents are organized into multiple sheets. These sheets often display different viewpoints on the same data, or display the same information aggregated to suit the needs of different types of users. An example of the first type of grouping might be a customer or marketing view of the data; an example of the second type might be a KPI dashboard for executives, with a more in-depth sheet for analysts.

Navigating the different sheets in a QlikView document is typically done by using the tabs at the top of the sheet. More sophisticated designs may opt to hide the tab row and use buttons to switch between the different sheets. The tabs in the Movie Database document also follow a logical order. An introduction is shown on the Intro tab, followed by a demonstration of the key concept of QlikView on the How QlikView works tab. After the contrast with Traditional OLAP is shown, the associative QlikView Model is introduced. The last two tabs show how this can be leveraged in a concrete Dashboard and Analysis.

Slicing and dicing your data

As we saw when we learned about the associative user experience, any selections made in QlikView are automatically applied to the entire data model. As we will see in the next section, slicing and dicing your data really is as easy as clicking and viewing!

List-boxes

But where should we click? QlikView lets us select data in a number of ways. A common method is to select a value from a list-box, which is done by clicking in the list-box. Let's switch to the How QlikView works tab to see how this works. We can do this by either clicking on the How QlikView works tab at the top of the sheet, or by clicking on the Get Started button. The selected tab shows two list-boxes, one containing Fruits and the other containing Colors. When we select Apple in the Fruits list-box, the screen automatically updates to show the associated data in the Colors list-box: Green and Red. The color Yellow is shown with a gray background to indicate that it is not associated, since there are no yellow apples. To select multiple values, all we need to do is hold down Ctrl while we are making our selection.

Selections in charts

Besides selections in list-boxes, we can also directly select data in charts. Let's jump to the Dashboard tab and see how this is done. The Dashboard tab contains a chart labeled Number of Movies, which lists the number of movies by a particular actor.
If we wish to select only the top three actors, we can simply drag the pointer to select them in the chart, instead of selecting them from a list-box: Because the selection automatically cascades to the rest of the model, this also results in the Actor list-box being updated to reflect the new selection: Of course, if we want to select only a single value in a chart, we don't necessarily need to lasso it. Instead, we can just click on the data point to select it. For example, clicking on James Stewart leads to only that actor being selected. Search While list-boxes and lassoing are both very convenient ways of selecting data, sometimes we may not want to scroll down a big list looking for a value that may or may not be there. This is where the search option comes in handy. For example, we may want to run a search for the actor Al Pacino. To do this, we first activate the corresponding list-box by clicking on it. Next, we simply start typing and the list-box will automatically be updated to show all values that match the search string. When we've found the actor we're looking for, Al Pacino in this case, we can click on that value to select it: Sometimes, we may want to select data based on associated values. For example, we may want to select all of the actors that starred in the movie Forrest Gump. While we could just use the Title list-box, there is also another option: associated search. To use associated search, we click on the chevron on the right-hand side of the search box. This expands the search box and any search term we enter will not only be checked against the Actor list-box, but also against the contents of the entire data model. When we type in Forrest Gump, the search box will show that there is a movie with that title, as seen in the screenshot below. If we select that movie and click on Return, all actors which star in the movie will be selected. Bookmarking selections Inevitably, when exploring data in QlikView, there comes a point where we want to save our current selections to be able to return to them later. This is facilitated by the bookmark option. Bookmarks are used to store a selection for later retrieval. Creating a new bookmark To create a new bookmark, we need to open the Add Bookmark dialog. This is done by either pressing Ctrl + B or by selecting Bookmark | Add Bookmark from the menu. In the Add Bookmark dialog, seen in the screenshot below, we can add a descriptive name for the bookmark. Other options allow us to change how the selection is applied (as either a new selection or on top of the existing selection) and if the view should switch to the sheet that was open at the time of creating the bookmark. The Info Text allows for a longer description to be entered that can be shown in a pop-up when the bookmark is selected. Retrieving a bookmark We can retrieve a bookmark by selecting it from the Bookmarks menu, seen here: Undoing selections Fortunately, if we end up making a wrong selection, QlikView is very forgiving. Using the Clear, Back, and Forward buttons in the toolbar, we can easily clear the entire selection, go back to what we had in our previous selections, or go forward again. Just like in our Internet browser, the Back button in QlikView can take us back multiple steps: Changing the view Besides filtering data, QlikView also lets us change the information being displayed. We'll see how this is done in the following sections. Cyclic Groups Cyclic Groups are defined by developers as a list of dimensions that can be switched between users. 
On the frontend, they are indicated with a circular arrow. For an example of how this works, let's look at the Ratio to Total chart, seen in the following image. By default, this chart shows movies grouped by duration. If we click on the little downward arrow next to the circular arrow, we will see a list of alternative groupings. Click on Decade to switch to the view to movies grouped by decade. Drill down Groups Drill down Groups are defined by the developer as a hierarchical list of dimensions which allows users to drill down to more detailed levels of the data. For example, a very common drill down path is Year | Quarter | Month | Day. On the frontend, drill down groups are indicated with an upward arrow. In the Movies Database document, a drill down can be found on the tab labeled Traditional OLAP. Let's go there. This drill down follows the path Director | Title | Actor. Click on the Director A. Edward Sutherland to drill down to all movies that he directed, shown in the following screenshot. Next, click on Every Day's A Holiday to see which actors starred in that movie. When drilling down, we can always go back to the previous level by clicking on the upward arrow, located at the top of the list-box in this example. Containers Containers are used to alternate between the display of different objects in the same screen space. We can select the individual objects by selecting the corresponding tab within the container. Our Movies Database example includes a container on the Analysis sheet. The container contains two objects, a chart showing Average length of Movies over time and a table showing the Movie List, shown in the following screenshot. The chart is shown by default, you can switch to the Movie List by clicking on the corresponding tab at the top of the object.   On the time chart, we can switch between Average length of Movies and Movie List by using the tabs at the top of the container object. But wait, there's more! After all of the slicing, dicing, drilling, and view-switching we've done, there is still the question on our minds: how can we export our selected data to Excel? Fortunately, QlikView is very flexible when it comes to this, we can simply right-click on any object and choose Send to Excel, or, if it has been enabled by the developer, we can click on the XL icon in an object's header.     Click on the XL icon in the Movie List table to export the list of currently selected movies to Excel. A word of warning when exporting data When viewing tables with a large number of rows, QlikView is very good at only rendering those rows that are presently visible on the screen. When Export values to Excel is selected, all values must be pulled down into an Excel file. For large data sets, this can take a considerable amount of time and may cause QlikView to become unresponsive while it provides the data.
Managing Files

Packt
05 Dec 2012
16 min read
(For more resources related to this topic, see here.)

Managing local files

In this section we will look at local file operations. We'll cover common operations that all computer users will be familiar with—copying, deleting, moving, renaming, and archiving files. We'll also look at some not-so-common techniques, such as timestamping files, checking for the existence of a file, and listing the files in a directory.

Copying files

For our first file job, let's look at a simple file copy process. We will create a job that looks in a specific directory for a file and copies it to another location. Let's do some setup first (we can use this for all of the file examples). In your project directory, create a new folder and name it FileManagement. Within this folder, create two more folders and name them Source and Target. In the Source directory, drop a simple text file and name it original.txt. Now let's create our job: Create a new folder in Repository and name it Chapter6. Create a new job within the Chapter6 directory and name it FileCopy. In the Palette, search for copy. You should be able to locate a tFileCopy component. Drop this onto the Job Designer. Click on its Component tab. Set the File Name field to point to the original.txt file in the Source directory. Set the Destination directory field to point to the Target directory. For now, let's leave everything else unchanged. Click on the Run tab and then click on the Run button. The job should complete pretty quickly and, because we only have a single component, there are no data flows to observe. Check your Target folder and you will see the original.txt file in there, as expected. Note that the file still remains in the Source folder, as we were simply copying the file.

Copying and removing files

Our next example is a variant of our first file management job. Previously, we copied a file from one folder to another, but often you will want to effect a file move. To use an analogy from desktop operating systems and programs, we want to do a cut and paste rather than a copy and paste. Open the FileCopy job and follow the given steps: Remove the original.txt file from the Target directory, making sure it still exists in the Source directory. In the Basic settings tab of the tFileCopy component, select the checkbox for Remove source file. Now run the job. This time the original.txt file will be copied to the Target directory and then removed from the Source directory.

Renaming files

We can also use the tFileCopy component to rename files as we copy or move. Again, let's work with the FileCopy job we have created previously. Reset your Source and Target directories so that the original.txt file only exists in Source. In the Basic settings tab, check the Rename checkbox. This will reveal a new parameter, Destination filename. Change the default value of the Destination filename parameter to modified_name.txt. Run the job. The original file will be copied to the Target directory and renamed. The original file will also be removed from the Source directory.

Deleting files

It is really useful to be able to delete files, for example, once they have been transformed or processed into other systems. Our integration jobs should "clean up afterwards", rather than leaving lots of interim files cluttering up the directories. In this job example we'll delete a file from a directory. This is a single-component job. Create a new job and name it FileDelete.
In your workspace directory, FileManagement/Source, create a new text file and name it file-to-delete.txt. From the Palette, search for filedelete and drag a tFileDelete component onto the Job Designer. Click on its Component tab to configure it. Change the File Name parameter to be the path to the file you created earlier in step 2. Run the job. After it is complete, go to your Source directory and the file will no longer be there. Note that the file does not get moved to the recycle bin on your computer, but is deleted immediately. Timestamping a file Sometimes in real life use, integration jobs, like any software, can fail or give an error. Server issues, previously unencountered bugs, or a host of other things can cause a job to behave in an unexpected manner, and when this happens, manual intervention may be needed to investigate the issue or recover the job that failed. A useful trick to try to incorporate into your jobs is to save files once they have been consumed or processed, in case you need to re-process them again at some point or, indeed, just for investigation and debugging purposes should something go wrong. A common way to save files is to rename them using a date/timestamp. By doing this you can easily identify when files were processed by the job. Follow the given steps to achieve this: Create a new job and call it FileTimestamp. Create a file in the Source directory named timestamp.txt. The job is going to move this to the Target directory, adding a time-stamp to the file as it processes. From the Palette, search for filecopy and drop a tFileCopy component onto the Job Designer. Click on its Component tab and change the File Name parameter to point to the timestamp.txt file we created in the Source directory. Change the Destination Directory to direct to your Target directory. Check the Rename checkbox and change the Destination filename parameter to "timestamp"+TalendDate.getDate("yyyyMMddhhmmss")+".txt". The previous code snippet concatenates the fixed file name, "timestamp", with the current date/time as generated by the Studio's getDate function at runtime. The file extension ".txt" is added to the end too. Run the job and you will see a new version of the original file drop into the Target directory, complete with timestamp. Run the job again and you will see another file in Target with a different timestamp applied. Depending on your requirements you can configure different format timestamps. For example, if you are only going to be processing one file a day, you could dispense with the hours, minutes, and second elements of the timestamp and simply set the output format to "yyyyMMdd". Alternatively, to make the timestamp more readable, you could separate its elements with hyphens—"yyyy-MM-dd", for example. You can find more information about Java date formats at http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html.. Listing files in a directory Our next example job will show how to list all of the files (or all the files matching a specific naming pattern) in a directory. Where might we use such a process? Suppose our target system had a data "drop-off" directory, where all integration files from multiple sources were placed before being picked up to be processed. As an example, this drop-off directory might contain four product catalogue XML files, three CSV files containing inventory data, and 50 order XML files detailing what had been ordered by the customers. 
We might want to build a catalogue import process that picks up the four catalogue files, processes them by mapping to a different format, and then moves them to the catalogue import directory. The nature of the processing means we have to deal with each file individually, but we want a single execution of the process to pick up all available files at that point in time. This is where our file listing process comes in very handy and, as you might expect, the Studio has a component to help us with this task. Follow the given steps: Let's start by preparing the directory and files we want to list. Copy the FileList directory from the resource files to the FileManagement directory we created earlier. The FileList directory contains six XML files. Create a new job and name it FileList. Search for Filelist in the Palette and drop a tFileList component onto the Job Designer. Additionally, search for logrow and drop a tLogRow component onto the designer too. We will use the tFileList component to read all of the filenames in the directory and pass this through to the tLogRow component. In order to do this, we need to connect the tFileList and tLogRow. The tFileList component works in an iterative manner—it reads each filename and passes it onwards before getting the next filename. Its connector type is Iterative, rather than the more common Main connector. However, we cannot connect an iterative component to the tLogRow component, so we need to introduce another component that will act as an intermediary between the two. Search for iteratetoflow in the Palette and drop a tIterateToFlow component onto the Job Designer. This bridges the gap between an iterate component and a fl ow component. Click on the tFileList component and then click on its Component tab. Change the directory value so that it points to the FileList directory we created in step 1. Click on the + button to add a new row to the File section. Change the value to "*.xml". This configures the component to search for any files with an XML extension. Right-click on the tFileList component, select Row | Iterate, and drop the resulting connector onto the tIterateToFlow component. The tIterateToFlow component requires a schema and, as the tFileList component does not have a schema, it cannot propagate this to the iterateto-flow component when we join them. Instead we will have to create the schema directly. Click on the tIterateToFlow component and then on its Component tab. Click on the Edit schema button and, in the pop-up schema editor, click on the + button to add a row and then rename the column value to filename. Click on OK to close the window. A new row will be added to the Mapping table. We need to edit its value, so click in the Value column, delete the setting that exists, and press Ctrl + space bar to access the global variables list. Scroll through the global variable drop-down list and select "tFileList_1_CURRENT_FILE". This will add the required parameter to the Value column. Right-click on the tIterateToFlow component, select Row | Main, and connect this to the tLogRow component. Let's run the job. It may run too quickly to be visible to the human eye, but the tFileList component will read the name of the first file it finds, pass this forward to the tIterateToFlow component, go back and read the second file, and so on. As the iterate-to-flow component receives its data, it will pass this onto tLogRow as row data. 
You will see the following output in the tLogRow component: Now that we have cracked the basics of the file list component, let's extend the example to a real-life situation. Let's suppose we have a number of text files in our input directory, all conforming to the same schema. In the resources directory, you will find five files named fileconcat1.txt, fileconcat2.txt, and so on. Each of these has a "random" number of rows. Copy these files into the Source directory of your workspace. The aim of our job is to pick up each file in turn and write its output to a new file, thereby concatenating all of the original files. Let's see how we do this: Create a new job and name it FileConcat. For this job we will need a file list component, a delimited file output component, and a delimited file input component. As we will see in a minute, the delimited input component will be a "placeholder" for each of the input files in turn. Find the components in the Palette and drop them onto the Job Designer. Click on the file list component and change its Directory value to point to the Source directory. In the Files box, add a row and change the Filemask value to "*.txt". Right-click on the file list component and select Row | Iterate. Drop the connector onto the delimited input component. Select the delimited input component and edit its schema so that it has a single field rowdata of data type String We need to modify the File name/Stream value, but in this case it is not a fixed file we are looking for but a different file with each iteration of the file list component. TOS gives us an easy way to add such variables into the component definitions. First, though, click on the File name/Stream box and clear the default value. In the bottom-left corner of the Studio you should see a window named Outline. If you cannot see the Outline window, select Window | Show View from the menu bar and type outline into the pop-up search box. You will see the Outline view in the search results—double click on this to open it. Now that we can see the Outline window, expand the tFileList item to see the variables available in it. The variables are different depending upon the component selected. In the case of a file list component, the variables are mostly attributes of the current file being processed. We are interested in the filename for each iteration, so click on the variable Current File Name with path and drag it to the File name/Stream box in the Component tab of the delimited input component. You can see that the Studio completes the parameter value with a globalMap variable—in this case, tFileList_1_CURRENT_FILEPATH, which denotes the current filename and its directory path. Now right-click on the delimited input, select Row | Main, and drop the connector onto the delimited output. Change the File Name of the delimited output component to fileconcatout.txt in our target directory and check the Append checkbox, so that the Studio adds the data from each iteration to the bottom of each file. If Append is not checked, then the Studio will overwrite the data on each iteration and all that will be left will be the data from the final iteration. Run the job and check the output file in the target directory. You will see a single file with the contents of the five original files in it. Note that the Studio shows the number of iterations of the file list component that have been executed, but does not show the number of lines written to the output file, as we are used to seeing in non-iterative jobs. 
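If it helps to picture what the Studio is doing during those iterations, the following is a rough, plain-Java sketch of the FileConcat logic: list the *.txt files, read each one, and append its rows to a single output file. This is only an illustration and not the code Talend actually generates; the directory paths are placeholders that would need to match your own Source and Target folders.

import java.io.IOException;
import java.nio.file.*;
import java.util.List;

public class FileConcatSketch {
    public static void main(String[] args) throws IOException {
        // Placeholder paths -- adjust to your own Source/Target directories
        Path source = Paths.get("FileManagement/Source");
        Path target = Paths.get("FileManagement/Target/fileconcatout.txt");

        // tFileList equivalent: iterate over every *.txt file in the directory
        try (DirectoryStream<Path> files = Files.newDirectoryStream(source, "*.txt")) {
            for (Path file : files) {
                // tFileInputDelimited equivalent: read the current file's rows
                List<String> rows = Files.readAllLines(file);
                // tFileOutputDelimited with "Append" checked: add rows to the end
                Files.write(target, rows,
                        StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            }
        }
    }
}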
Checking for files Let's look at how we can check for the existence of a file before we undertake an operation on it. Perhaps the first question is "Why do we need to check if a file exists?" To illustrate why, open the FileDelete job that we created earlier. If you look at its component configuration, you will see that it will delete a file named file-todelete. txt in the Source directory. Go to this directory using your computer's file explorer and delete this file manually. Now try to run the FileDelete job. You will get an error when the job executes: The assumption behind a delete component (or a copy, rename, or other file operation process) is that the file does, in fact, exist and so the component can do its work. When the Studio finds that the file does not exist, an error is produced. Obviously, such an error is not desirable. In this particular case nothing too untoward happens—the job simply errors and exits—but it is better if we can avoid unnecessary errors. What we should really do here is check if the file exists and, if it does, then delete it. If it does not exist, then the delete command should not be invoked. Let's see how we can put this logic together Create a new job and name it FileExist. Search for fileexist in the Palette and drop a tFileExist component onto the Job Designer. Then search for filedelete and place a tFileDelete component onto the designer too. In our Source directory, create a file named file-exist.txt and configure File Name of the tFileDelete component to point to this. Now click on the tFileExist component and set its File name/Stream parameter to be the same file in the Source directory. Right-click on the tFileExist component, select Trigger | Run if, and drop the connector onto the tFileDelete component. The connecting line between the two components is labeled If. When our job runs the first component will execute, but the second component, tFileDelete, will only run if some conditions are satisfied. We need to configure the if conditions. Click on If and, in the Component tab, a Condition box will appear. In the Outline window (in the bottom-left corner of the Studio), expand the tFileExist component. You will see three attributes there. The Exists attribute is highlighted in red in the following screenshot: Click on the Exists attribute and drag it into the Conditions box of the Component tab. As before, a global-map variable is written to the configuration. The logic of our job is as follows: i. Run the tFileExist component. ii. If the file named in tFileExist actually exists, run the tFileDelete component.    Note that if the file does not exist, the job will exit. We can check if the job works as expected by running it twice. The file we want to delete is in the Source directory, so we would expect both components to run on the first execution (and for the file to be deleted). When the if condition is evaluated, the result will show in the Job Designer view. In this case, the if condition was true—the file did exist. Now try to run the job again. We know that the file we are checking for does not exist, as it was deleted on the last execution. This time, the if condition evaluates to false, and the delete component does not get invoked. You can also see in the console window that the Studio did not log any errors. Much better! Sometimes we may want to verify that a file does not exist before we invoke another component. We can achieve this in a similar way to checking for the existence of a file, as shown earlier. 
Drag the Exists variable into the Conditions box and prefix the statement with !—the Java operator for "not": !((Boolean)globalMap.get("tFileExist_1_EXISTS"))
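The globalMap used in that condition is essentially a shared java.util.Map of Objects that the generated job code uses to pass component variables around, which is why the value has to be cast back to Boolean before it can be negated. The following minimal sketch mimics that behaviour outside of Talend; the class name and the hard-coded value are made up purely for illustration.

import java.util.HashMap;
import java.util.Map;

public class GlobalMapSketch {
    public static void main(String[] args) {
        // Talend publishes component output variables into a shared map of Objects
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tFileExist_1_EXISTS", Boolean.FALSE);

        // The "Run if" condition from the recipe: run only when the file exists
        boolean runIfExists = (Boolean) globalMap.get("tFileExist_1_EXISTS");

        // The negated form shown above: run only when the file does NOT exist
        boolean runIfMissing = !((Boolean) globalMap.get("tFileExist_1_EXISTS"));

        System.out.println("exists condition: " + runIfExists);
        System.out.println("missing condition: " + runIfMissing);
    }
}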
Securing Data at Rest in Oracle 11g

Packt
23 Oct 2012
11 min read
Introduction The Oracle physical database files are primarily protected by filesystem privileges. An attacker who has read permissions on these files will be able to steal the entire database or critical information such as datafiles containing credit card numbers, social security numbers, or other types of private information. Other threats are related to data theft from storage mediums where the physical database resides. The same applies for unprotected backups or dumps that can be easily restored or imported. The data in the database is stored in proprietary format that is quite easy to decipher. There are several sites and specialized tools available to extract data from datafiles, backups, and dumps, known generically as Data Unloading ( DUL). These tools are usually the last solution when the database is corrupted and there is no backup available for restore and recovery. As you probably have already guessed, they can be used by an attacker for data extraction from stolen databases or dumps (summary descriptions and links to several DUL tools can be found at http://www.oracle-internals.com/?p=17 Blvd). The technology behind DUL utilities is based on understanding how Oracle keeps the data in datafiles behind the scenes (a very good article about Oracle datafile internals, written by Rodrigo Righetti, can be found at http://docs.google.com/Doc?id=df2mxgvb_1dgb9fv). Once you decipher the mechanism you will be able to build your tool with little effort. One of the best methods for protecting data at rest is encryption. We can enumerate the following as data encryption methods, described in this chapter for using with Oracle database: Operating system proprietary filesystem or block-based encryption Cryptographic API, especially DBMS_CRYPTO used for column encryption Transparent Data Encryption for encrypting columns, tablespaces, dumps, and RMAN backups Using block device encryption By using block device encryption the data is encrypted and decrypted at block-device level. The block device can be formatted with a filesystem. The decryption is performed once the filesystem is mounted by the operating system, transparently for users. This type of encryption protects best against media theft and can be used for datafile placement. In this recipe we will add a new disk and implement block-level encryption with Linux Unified Key Setup-on-disk-format (LUKS). Getting ready All steps will be performed with nodeorcl1 as root. How to do it... Shut down nodeorcl1, then add a new disk to the nodeorcl1 system and boot it. Our new device will be seen by the operating system as /dev/sdb . Next, create a new partition /dev/sdb1 using fdisk as follows: [root@nodeorcl1 ~]# fdisk /dev/sdb WARNING: DOS-compatible mode is deprecated. It's strongly recommended to switch off the mode (command 'c') and change display units to sectors (command 'u'). Command (m for help): n Command action e extended p primary partition (1-4) p Partition number (1-4): 1 First cylinder (1-5577, default 1): Using default value 1 Last cylinder, +cylinders or +size{K,M,G} (1-5577, default 5577): Using default value 5577 Command (m for help): w The partition table has been altered! Calling ioctl() to re-read partition table. Syncing disks. Format and add a passphrase for encryption on /dev/sdb1 device with cryptsetup utility as follows: [root@nodeorcl1 dev]# cryptsetup luksFormat /dev/sdb1 WARNING! ======== This will overwrite data on /dev/sdb1 irrevocably. Are you sure? 
(Type uppercase yes): YES Enter LUKS passphrase: P5;@o[]klopY&P] Verify passphrase: P5;@o[]klopY&P] [root@nodeorcl1 dev]# The access on the encrypted device is not performed directly; all operations are performed through a device-mapper. Open the device-mapper for /dev/sdb1 as follows: [root@nodeorcl1 mapper]# cryptsetup luksOpen /dev/sdb1 storage Enter passphrase for /dev/sdb1: P5;@o[]klopY&P] [root@nodeorcl1 mapper]# [root@nodeorcl1 mapper]# ls -al /dev/mapper/storage lrwxrwxrwx. 1 root root 7 Sep 23 20:03 /dev/mapper/storage -> ../ dm-4 The formatting with a filesystem must also be performed on the device-mapper. Format the device-mapper with the ext4 filesystem as follows: [root@nodeorcl1 mapper]# mkfs.ext4 /dev/mapper/storage mke2fs 1.41.12 (17-May-2010) Filesystem label= OS type: Linux Block size=4096 (log=2) Fragment size=4096 (log=2) ………………………………………………………………………………………………………… This filesystem will be automatically checked every 38 mounts or 180 days, whichever comes first. Use tune2fs -c or -i to override. [root@nodeorcl1 mapper]# Next we will configure the device-mapper /dev/mapper/storage for automatic mount during boot. Create a directory called storage that will be used as the mount point: [root@nodeorcl1 storage]# mkdir /storage The mapper-device /dev/mapper/storage can be mounted as a normal device: [root@nodeorcl1 storage]# mount /dev/mapper/storage /storage To make the mount persistent across reboots add /storage as the mount point for /dev/mapper/storage. First add the mapper-device name into /etc/crypttab: [root@nodeorcl1 storage]# echo "storage /dev/sdb1" > /etc/crypttab Add the complete mapper-device path, mount point, and filesystem type in /etc/fstab as follows: /dev/mapper/storage /storage ext4 defaults 1 2 Reboot the system: [root@nodeorcl1 storage]# shutdown –r now At boot sequence, the passphrase for /storage will be requested. If no passphrase is typed then the mapper device will be not mounted. How it works... Block device encryption is implemented to work below the filesystem level. Once the device is offline, the data appears like a large blob of random data. There is no way to determine what kind of filesystem and data it contains. There's more... To dump information about the encrypted device you should execute the following command: [root@nodeorcl1 dev]# cryptsetup luksDump /dev/sdb1 LUKS header information for /dev/sdb1 Version: 1 Cipher name: aes Cipher mode: cbc-essiv:sha256 Hash spec: sha1 Payload offset: 4096 MK bits: 256 MK digest: 2c 7a 4c 96 9d db 63 1c f0 15 0b 2c f0 1a d9 9b 8c 0c 92 4b MK salt: 59 ce 2d 5b ad 8f 22 ea 51 64 c5 06 7b 94 ca 38 65 94 ce 79 ac 2e d5 56 42 13 88 ba 3e 92 44 fc MK iterations: 51750 UUID: 21d5a994-3ac3-4edc-bcdc-e8bfbf5f66f1 Key Slot 0: ENABLED Iterations: 207151 Salt: 89 97 13 91 1c f4 c8 74 e9 ff 39 bc d3 28 5e 90 bf 6b 9a c0 6d b3 a0 21 13 2b 33 43 a7 0c f1 85 Key material offset: 8 AF stripes: 4000 Key Slot 1: DISABLED Key Slot 2: DISABLED Key Slot 3: DISABLED Key Slot 4: DISABLED Key Slot 5: DISABLED Key Slot 6: DISABLED Key Slot 7: DISABLED [root@nodeorcl1 ~]# Using filesystem encryption with eCryptfs The eCryptfs filesytem is implemented as an encryption/decryption layer interposed between a mounted filesystem and the kernel. The data is encrypted and decrypted automatically at filesystem access. It can be used for backup or sensitive files placement for transportable or fixed storage mediums. In this recipe we will install and demonstrate some of eCryptfs, capabilities. 
Getting ready All steps will be performed on nodeorcl1. How to do it... eCryptfs is shipped and bundled with the Red Hat installation kit. The eCryptfs package is dependent on the trouser package. As root user, first install the trouser package followed by installation of the ecryptfs-util package: [root@nodeorcl1 Packages]# rpm -Uhv trousers-0.3.4-4.el6.x86_64. rpm warning: trousers-0.3.4-4.el6.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY Preparing... ###################################### ##### [100%] 1:trousers ###################################### ##### [100%] [root@nodeorcl1 Packages]# rpm -Uhv ecryptfs-utils-82-6.el6. x86_64.rpm warning: ecryptfs-utils-82-6.el6.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID fd431d51: NOKEY Preparing... ###################################### ##### [100%] 1:ecryptfs-utils ###################################### ##### [100%] Create a directory that will be mounted with the eCryptfs filesystem and set the oracle user as the owner: [root@nodeorcl1 ~]# mkdir /ecryptedfiles [root@nodeorcl1 ~]# chown -R oracle:oinstall /ecryptedfiles Mount /ecryptedfiles to itself using the eCryptfs filesystem. Use the default values for all options and use a strong passphrase as follows: [root@nodeorcl1 hashkeys]# mount -t ecryptfs /ecryptedfiles / ecryptedfiles Select key type to use for newly created files: 1) openssl 2) tspi 3) passphrase Selection: 3 Passphrase: lR%5_+KO}Pi_$2E Select cipher: 1) aes: blocksize = 16; min keysize = 16; max keysize = 32 (not loaded) 2) blowfish: blocksize = 16; min keysize = 16; max keysize = 56 (not loaded) 3) des3_ede: blocksize = 8; min keysize = 24; max keysize = 24 (not loaded) 4) cast6: blocksize = 16; min keysize = 16; max keysize = 32 (not loaded) 5) cast5: blocksize = 8; min keysize = 5; max keysize = 16 (not loaded) Selection [aes]: Select key bytes: 1) 16 2) 32 3) 24 Selection [16]: Enable plaintext passthrough (y/n) [n]: Enable filename encryption (y/n) [n]: y Filename Encryption Key (FNEK) Signature [d395309aaad4de06]: Attempting to mount with the following options: ecryptfs_unlink_sigs ecryptfs_fnek_sig=d395309aaad4de06 ecryptfs_key_bytes=16 ecryptfs_cipher=aes ecryptfs_sig=d395309aaad4de06 Mounted eCryptfs [root@nodeorcl1 hashkeys]# Switch to the oracle user and export the HR schema to /ecryptedfiles directory as follows: [oracle@nodeorcl1 ~]$ export NLS_LANG=AMERICAN_AMERICA.AL32UTF8 [oracle@nodeorcl1 ~]$ exp system file=/ecryptedfiles/hr.dmp owner=HR statistics=none Export: Release 11.2.0.3.0 - Production on Sun Sep 23 20:49:30 2012 Copyright (c) 1982, 2011, Oracle and/or its affiliates. All rights reserved. Password: Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production With the Partitioning, OLAP, Data Mining and Real Application Testing options Export done in AL32UTF8 character set and AL16UTF16 NCHAR character set About to export specified users ... …………………………………………………………………………………………………………….. . . exporting table LOCATIONS 23 rows exported . . exporting table REGIONS 4 rows exported . …………………………………………………………………………………………………….. . exporting post-schema procedural objects and actions . exporting statistics Export terminated successfully without warnings. [oracle@nodeorcl1 ~]$ If you open the hr.dmp file with the strings command, you will be able to see the content of the dump file: [root@nodeorcl1 ecryptedfiles]# strings hr.dmp | more ……………………………………………………………………………………………………………………………………….. 
CREATE TABLE "COUNTRIES" ("COUNTRY_ID" CHAR(2) CONSTRAINT "COUNTRY_ID_NN" NOT NULL ENABLE, "COUNTRY_NAME" VARCHAR2(40), "REGION_ID" NUMBER, CONSTRAINT "COUNTRY_C_ID_PK" PRIMARY KEY ("COUNTRY_ID") ENABLE ) ORGANIZATION INDEX PCTFREE 10 INITRANS 2 MAXTRANS 255 STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT) TABLESPACE "EXAMPLE" NOLOGGING NOCOMPRESS PCTTHRESHOLD 50 INSERT INTO "COUNTRIES" ("COUNTRY_ID", "COUNTRY_NAME", "REGION_ ID") VALUES (:1, :2, :3) Argentina Australia Belgium Brazil Canada Next as root unmount /ecryptedfiles as follows: [root@nodeorcl1 /]# unmount /ecryptedfiles/ If we list the content of the /ecryptedfile directory now, we should see that the file name and content is encrypted: [root@nodeorcl1 /]# cd /ecryptedfiles/ [root@nodeorcl1 ecryptedfiles]# ls ECRYPTFS_FNEK_ENCRYPTED.FWbHZH0OehHS.URqPdiytgZHLV5txs- bH4KKM4Sx2qGR2by6i00KoaCBwE-- [root@nodeorcl1 ecryptedfiles]# [root@nodeorcl1 ecryptedfiles]# more ECRYPTFS_FNEK_ENCRYPTED. FWbHZH0OehHS.URqPdiytgZHLV5txs-bH4KKM4Sx2qGR2by6i00KoaCBwE-- ………………………………………………………………………………………………………………………………… 9$Eî□□KdgQNK□□v□□ S□□J□□□ h□□□ PIi'ʼn□□R□□□□□siP □b □`)3 □W □W( □□□□c!□□8□E.1'□R□7bmhIN□□--(15%) …………………………………………………………………………………………………………………………………. To make the file accessible again, mount the /ecryptedfiles filesystem by passing the same parameters and passphrase as performed in step 3. How it works... eCryptfs is mapped in the kernel Virtual File System ( VFS ), similarly with other filesystems such as ext3, ext4, and ReiserFS. All calls on a filesystem will go first through the eCryptfs mount point and then to the current filesystem found on the mount point (ext4, ext4, jfs, ReiserFS). The key used for encryption is retrieved from the user session key ring, and the kernel cryptographic API is used for encryption and decryption of file content. The communication with kernel is performed by the eCryptfs daemon. The file data content is encrypted for each file with a distinct randomly generated File Encryption Key ( FEK ); FEK is encrypted with File Encryption Key Encryption Key ( FEKEK ) resulting in an Encrypted File Encryption Key ( EFEK) that is stored in the header of file. There's more... On Oracle Solaris you can implement filesystem encryption using the ZFS built-in filesystem encryption capabilities. On IBM AIX you can use EFS.
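One practical follow-up to this recipe: if the /ecryptedfiles mount is needed by scheduled jobs such as nightly exports, answering the interactive prompts is not workable. The mount helper accepts the same choices as mount options, so the mount can be scripted along the lines of the sketch below. Treat it as a starting point only: the passphrase file location is an example, and exact option support varies between ecryptfs-utils versions, so verify against the ecryptfs man page on your system before relying on it.

# Keep the passphrase in a root-only file (the path is just an example)
echo 'passphrase_passwd=lR%5_+KO}Pi_$2E' > /root/ecryptfs.pw
chmod 600 /root/ecryptfs.pw

# Mount non-interactively, repeating the choices made in the recipe above
mount -t ecryptfs /ecryptedfiles /ecryptedfiles -o key=passphrase:passphrase_passwd_file=/root/ecryptfs.pw,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_passthrough=n,ecryptfs_enable_filename_crypto=y,no_sig_cache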
Piwik: Tracking User Interactions

Packt
11 Oct 2012
15 min read
Tracking events with Piwik Many of you may be familiar with event tracking with Google Analytics; and many of you may not. In Google Analytics, event tracking is pretty structured. When you track an event with Google, you get five parameters: Category: The name for the group of objects you want to track Action: A string that is used to define the user in action for the category of object Label: This is optional and is for additional data Value: This is optional and is used for providing addition numerical data Non-interaction: This is optional and will not add the event hit into bounce rate calculation if set to true We are going over a few details on event tracking with Google Analytics because the custom variable feature we will be using for event tracking in Piwik is a little less structured. And a little structure will help you drill down the details of your data more effectively from the start. You won't have to restructure and change your naming conventions later on and lose all of your historical data in the process. We don't need to look over the code for Google Analytics. Just know that it may help to set up your event tracking with a similar structure. If you had videos on your site, enough to track, you would most likely make a category of events Videos. You can create as many as you need for the various objects you want to track on your site: Videos Maps Games Ads Blog posts (social media actions) Products As for the actions that can be performed on those Videos, I can think of a few: Play Pause Stop Tweet Like Download There are probably more than you can think of, but now that we have these actions we can connect them with elements of the site. As for the label parameter, you would probably want to use that to store the title of the movie the visitor is interacting with or the page it is located on. We will skip the value parameter which is for numeric data because with Piwik, you won't have a similar value. But non-interaction is interesting; it means that by default an action on a page counts to lower the bounce rate from that page since the user is doing something. Unfortunately, this is not a feature that we have using Piwik currently, although that could change in the future. Okay, now that we have learned one of the ways to structure our events, let's look at the way we can track events in Piwik. There is really nothing called event tracking in Piwik, but Piwik does have custom variables which will do the same job. But, since it is not really event tracking in the truest sense of the word, the bounce rate will be unaffected by any of the custom variables collected. In other words, unlike Google Analytics, you don't get the non-interaction parameter you can set. But let's see what you can do with Piwik. Custom variables are name-value pairs that you can set in Piwik. You can assign up to five custom variables for each visitor and/or each page view. The function for setting a custom variable is setCustomVariable . You must call it for each custom variable you set up to the limit of five. piwikTracker.setCustomVariable (index, name, value, scope); And here are what you set the parameters to: index: This is a number from 1 to 5 where your custom variables are stored. It should stay the same as the name of custom variable. Changing the index of a name later will reset all old data. name: This is the custom variable name or key. value: This is the value for name. scope: This parameter sets the scope of the custom variable, whether it is being tracked per visit or per page. 
And what scope we set depends upon what we are tracking and how complex our site is. So how do these custom variables fit our model of event tracking? Well we have to do things a little bit differently. For most of our event tracking, we will have to set our variable scope per page. There is not enough room to store much data at the visit level. That is good for other custom tracking you may need but for event tracking you will need more space for data. So with page level custom variables, you get five name-value sets per page. So, we would set up our variables similar to something like this for a video on the page: Index = 1 Name = "Video" Value = "Play","Pause","Stop","Tweet","Like", and so on Scope = "page" And this set of variables in using Piwik's custom variable function would look like one of the following: piwikTracker.setCustomVariable(1,"Video","Play","page"); piwikTracker.setCustomVariable(1,"Video","Pause","page"); piwikTracker.setCustomVariable(1,"Video","Tweet","page"); Which one you would use would depend on what action you are tracking. You would use JavaScript in the page to trigger these variables to be set, most likely by using an onClick event on the button. We will go into the details of various event tracking scenarios later in this chapter. You will notice in the previous snippets of code that the index value of each call is 1. We have set the index of the "Video" name to 1 and must stick to this now on the page or data could be overwritten. This also leaves us the two to five indexes still available for use on the same page. That means if we have banner ads on the page, we could use one of the spare indexes to track the ads. piwikTracker.setCustomVariable(2,"SidebarBanner","Click","page"); You will notice that Google event tracking has the label variable. As we are using page leveling custom variables with Piwik and the variables will be attached to the page itself, there is no need to have this extra variable in most cases. If we do need to add extra data other than an action value, we will have to concatenate our data to the action and use the combined value in Piwik's custom tracking's value variable. Most likely, if we have one banner on our video page, we will have more and to track those click events per banner, we may have to get a little creative using the following: piwikTracker.setCustomVariable(2,"SidebarBanner", "AddSlot1Click","page"); piwikTracker.setCustomVariable(2,"SidebarBanner", "AddSlot2Click","page"); piwikTracker.setCustomVariable(2,"SidebarBanner", "AddSlot3Click","page"); Of course, it is up to you whether you join your data together by using CamelCase, which means joining each piece of data together after capitalizing each. This is what I did previously. You can also use spaces or underscores as long as it is understandable to you and you stick to it. Since the name and value are in quotation marks, you can use any suitable string. And again, since these are custom variables, if you come up with a better system of setting up your event tracking that works better with your website and business model, then by all means try it. Whatever works best for you and your site is better in the long run. So now that we have a general idea of how we will be tracking events with Piwik, let's look at some specific examples and more in depth at what events are, compared to goals or page views. 
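To make this concrete, here is what the Video example might look like wired to actual page elements. The button IDs are hypothetical, invented purely for illustration, and the snippet relies on the piwikTracker.setCustomVariable()/piwikTracker.trackLink() pairing that, as we will see later in this chapter, is needed for a variable set from a click handler to actually be recorded.

<script type="text/javascript">
// "video-play" and "video-pause" are placeholder IDs for the player's buttons
document.getElementById("video-play").onclick = function() {
  piwikTracker.setCustomVariable(1, "Video", "Play", "page");
  piwikTracker.trackLink(); // required so the page-scope variable is sent
};
document.getElementById("video-pause").onclick = function() {
  piwikTracker.setCustomVariable(1, "Video", "Pause", "page");
  piwikTracker.trackLink();
};
</script>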
Tracking social engagement You know that you have a Facebook "Like" button on your page, a Twitter "tweet" button, and possibly lots more buttons that do various things at other sites that you yourself have no control over and can add no tracking code to. But you can track clicks on the button itself. You use event tracking for what you could call micro-conversions. But there is really nothing micro about them. That Facebook "Like" could end up in many more sales or conversions than a standard conversion. They could be the route on the way to one or multiple conversions. There may be a blurry line between engagement goals and micro-conversions. And really, it is up to you what weight you give to visitor actions on your site, but use events for something smaller than you would consider a goal. If your goal is sales on your website, that Facebook "Like" should cause a spike in your sales and you will be able to correlate that to your event tracking, but the "Like" is not the end of the road, or the goal. It is a stop on the way. If your website is a blog and your goal is to promote advertising or your services with your content, then tracking social engagement can tell you which topics have the highest social interest so that you can create similar content in the future. So what are some other events we can track? Of course, you would want to track anything having to do with liking, tweeting, bookmarking, or somehow spreading your site on a social network. That includes Facebook , Twitter , Digg , StumbleUpon , Pinterest , and any other social network whose button you put on your site. If you spent enough time to put the buttons on your pages, you can at least track these events. And if you don't have buttons, you have to remember that each generation is using the Internet more often; smartphones make it available everywhere, and everyone is on a social network. Get with it. And don't forget to add event tracking to any sort of Follow Me or Subscribe button. That too is an event worth tracking. We will also look at blog comments since we can consider them to be in the social realm of our tracking. Tracking content sharing So let's look at a set of social sharing buttons on our website. We aren't going to blow things out of proportion by using buttons for every social network out there, just two: Twitter and Facebook. Your site may have less and should have more, but the same methods we will explore next can be used for any amount of content sharing buttons. We are event tracking, so let's begin by defining what our custom variable data will be. We need to figure out how we are going to set up our categories of events and the actions. In this example, we will be using buttons on our Holy Hand Grenade site: You will see our Twitter button and our Facebook button right underneath the image of our Holy Hand Grenade. We are going to act as if our site has many more pages and events to track on it and use a naming convention that will leave room for growth. So we are going to use the category of product shares. That way we have room for video shares when we finally get that cinematographer and film our Holy Hand Grenade in action. Now we need to define our actions. We will be adding more buttons later after we test the effectiveness of our Facebook and Twitter buttons. This means we need a separate action to distinguish each social channel. 
Share on Facebook Share on Twitter And then we add more buttons: Share on Google+ Share on Digg Share on Reddit Share on StumbleUpon So let's look at the buttons in the source of the page for a minute to see what we are working with: <li class="span1"> <script type="text/javascript" src="http://platform.twitter.com/widgets.js"></script> <a href="https://twitter.com/intent/tweet?url=http%3A%2F%2Fdir23.com&text=Holy%20Hand%20Grenades"class="twitter-share-button">Tweet</a> </li> <li class="span1"></li> <li class="span1"> <script src="http://connect.facebook.net/en_US/all.js#xfbml=1"></script> <fb:like href="http://dir23.com/" show_faces="false"width="50" font=""></fb:like> </li> </ul> <p><a class="btn btn-primary btn-large">Buy Now >></a></p> You see that the buttons are not really buttons yet, they are only HTML anchors in the code and JavaScript includes. Before we start looking at the code to track clicks on these buttons, we need to go over some details about the way Piwik's JavaScript works. Setting a custom variable in Piwik using an onclick event is a very tricky procedure. To start with, you must call more than just setCustomVariable because that will not work after the Piwik tracking JavaScript has loaded and trackPageView has been called. But there is a way around this. First, you call setCustomVariable and then, in that same onclick event , you call trackLink , as in the next example: <p><a href="buynow.html" class="btn btn-primary btn-large" onclick="javascript:piwikTracker.setCustomVariable(2,'Product Pricing','ViewProduct Price','page');piwikTracker.trackLink();">Buy Now >></a></p> If you forget to add the piwikTracker.trackLink() call, nothing will happen and no custom variables will be set. Now with the sharing buttons, we have another issue when it comes to tracking clicks. Most of these buttons, including Facebook, Twitter, and Google+ use JavaScript to create an iframe that has the button. This is a problem, because the iframe is on another domain and there is not an easy way to track clicks. For this reason, I suggest using your social network's API functionality to create the button so that you can create a callback that will fire when someone likes or tweets your page. Another advantage to this method is that you will be sure that each tracked tweet or like will be logged accurately. Using an on_click event will cause a custom variable to be created with every click. If the person is not logged in to their social account at the time, the actual tweet or like will not happen until after they log in, even if they decide to do so. Facebook, Twitter, and Google+ all have APIs with this functionality. But if you decide to try to track the click on the iframe, you can take a look at the code at http://www.bennadel.com/blog/1752-Tracking-Google-AdSense- Clicks-With-jQuery-And-ColdFusion.htm to see how complicated it can get. The click is not really tracked. The blur on the page is tracked, because blur usually happens if a link in the iframe is clicked and a new page is about to load. We already have our standard Piwik tracking code on the page. This does not have to be modified in any way for event tracking. Instead we will be latching into Twitters and Facebook's APIs which we loaded in the page by including their JavaScript. 
<script>
twttr.events.bind('tweet', function(event) {
  piwikTracker.setCustomVariable(1, 'Product Shares', 'Share on Twitter', 'page');
  piwikTracker.trackLink();
});
</script>

<script type="text/javascript">
FB.Event.subscribe('edge.create', function(response) {
  piwikTracker.setCustomVariable(1, 'Product Shares', 'Share on Facebook', 'page');
  piwikTracker.trackLink();
});
</script>

We add these two simple scripts to the bottom of the page. I put them right before the Piwik tracking code. The first script binds to the tweet event in Twitter's API and, once that event fires, our Piwik code executes and sets our custom variable. Notice that here too we have to call trackLink right afterwards. The second script does the same thing when someone likes the page on Facebook. It is beyond the scope of this book to go into more detail about social APIs, but this code will get you started, and you can do more research on your chosen social network's API on your own to see what type of event tracking will be possible. For example, with the Twitter API you can bind a function to each one of these actions: click, tweet, retweet, favorite, or follow. There are definitely more possibilities with this than there are with a simple onclick event. Using event tracking on your social sharing buttons will let you know where people share your line of Holy Hand Grenades. This will help you figure out just which social networks you should have a presence on. If people on Twitter like the grenades, then you should make sure to keep your Twitter account active; and if you don't have a Twitter account and your product is going viral there, you need to get one quickly and participate in the conversation about your product. Or you may want to invest in the development of a Facebook app and you are not quite sure that it is worth the investment. Well, a little bit of event tracking will tell you if you have enough people interested in your website or products to fork over the money for an app. Or maybe a person goes deep into the pages of your site, digs out a gem, and it gets passed around StumbleUpon like crazy. This might indicate a page that you should feature on the home page of your website. And if it's a product page that's been hidden from the light for years, maybe throw some advertising its way too.
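As a final sketch, the same binding pattern extends to the other Twitter intent events listed above. The example below assumes you want to track follows separately; the 'Social Follows' category name and the use of index 3 (an index not used by the earlier examples on this page) are arbitrary choices for illustration, not anything Piwik prescribes.

<script type="text/javascript">
twttr.events.bind('follow', function(event) {
  // "Social Follows" and index 3 are illustrative choices, not requirements
  piwikTracker.setCustomVariable(3, 'Social Follows', 'Follow on Twitter', 'page');
  piwikTracker.trackLink();
});
</script>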
Overview of FIM 2010 R2

Packt
03 Sep 2012
18 min read
The following picture shows a high-level overview of the FIM family and the components relevant to an FIM 2010 R2 implementation:     Within the FIM family, there are some parts that can live by themselves and others that depend on other parts. But, in order to fully utilize the power of FIM 2010 R2, you should have all parts in place. At the center, we have FIM Service and FIM Synchronization Service (FIM Sync). The key to a successful implementation of FIM 2010 R2 is to understand how these two components work—by themselves as well as together.   The history of FIM 2010 R2 Let us go through a short summary of the versions preceding FIM 2010 R2. In 1999, Microsoft bought a company called Zoomit. They had a product called VIA —a directory synchronization product. Microsoft incorporated Zoomit VIA into Microsoft Metadirectory Services (MMS). MMS was only available as a Microsoft Consulting Services solution. In 2003, Microsoft released Microsoft Identity Integration Server (MIIS), and this was the first publicly available version of the synchronization engine today known as FIM 2010 R2 Synchronization Service. In 2005, Microsoft bought a company called Alacris. They had a product called IdNexus, which was used to manage certificates and smart cards. Microsoft renamed it Certificate Lifecycle Manager (CLM). In 2007, Microsoft took MIIS (now with Service Pack 2) and CLM and slammed them together into a new product called Identity Lifecycle Manager 2007 (ILM 2007). Despite the name, ILM 2007 was basically a directory synchronization tool with a certificate management side-kicker. Finally, in 2010, Microsoft released Forefront Identity Manager 2010 (FIM 2010). FIM 2010 was a whole new thing, but as we will see, the old parts from MIIS and CLM are still there. The most fundamental change in FIM 2010 was the addition of the FIM Service component. The most important news was that FIM Service added workflow capability to the synchronization engine. Many identity management operations that used to require a lot of coding were suddenly available without a single line of code. In FIM 2010 R2, Microsoft added the FIM Reporting component and also made significant improvements to the other components.   FIM Synchronization Service (FIM Sync) FIM Synchronization Service is the oldest member of the FIM family. Anyone who has worked with MIIS back in 2003 will feel quite at home with it. Visually, the management tools look the same. FIM Synchronization Service can actually work by itself, without any other component of FIM 2010 R2 being present. We will then basically get the same functionality as MIIS had, back in 2003. FIM Synchronization Service is the heart of FIM, which pumps the data around, causing information about identities to flow from one system to another. Let's look at the pieces that make up the FIM Synchronization Service:     As we can see, there are lots of acronyms and concepts that need a little explaining. On the right-hand side of FIM Synchronization Service, we have Metaverse (MV). Metaverse is used to collect all the information about all the identities managed by FIM. On the other side, we have Connected Data Source (CDS). Connected Data Source is the database, directory, and file, among others, that the synchronization service imports information regarding the managed identities from, and/or exports this information to. To talk to different kinds of Connected Data Sources, FIM Synchronization Service uses adapters that are called Management Agents (MA). 
In FIM 2010 R2, we will start to use the term Connectors instead. But, as the user interface in FIM Synchronization Manager still uses the term Management Agent, that is the term we will use here. The Management Agent stores a representation of the objects in the CDS, in its Connector Space (CS). When stored in the Connector Space, we refer to the objects as holograms. If we were to look into this a little deeper, we would find that the holograms (objects) are actually stored in multiple instances, so that the Management Agent can keep track of the changes to the objects in the Connector Space. In order to synchronize information from/to different Connected Data Sources, we connect the objects in the Connector Space with the corresponding object in the Metaverse. By collecting information from all Connected Data Sources, the synchronization engine aggregates the information about the object into the Metaverse object. This way, the Metaverse will only contain one representation of the object (for example, a user).
To describe the data flow within the synchronization service, let's look at the previous diagram and follow a typical scenario. The scenario is this: we want information in our Human Resource (HR) system to govern how users appear in Active Directory (AD) and in our e-mail system.
1. Import users from HR: The bottom CDS could be our HR system. We configure a Management Agent to import users from HR to the corresponding CS.
2. Projection to Metaverse: As there is no corresponding user in the MV that we can connect to, we tell the MA to create a new object in the MV. The process of creating new objects in the MV is called Projection. To transfer information from the HR CS to the MV, we configure Inbound Synchronization Rules.
3. Import and join users from AD: The middle CDS could be Active Directory (AD). We configure a Management Agent to import users from AD. Because there are now objects in the MV, we can tell the Management Agent to try to match the user objects from AD to the objects in the MV. Connecting existing objects in a Connector Space to an existing object in the Metaverse is called Joining. In order for the synchronization service to know which objects to connect, some kind of unique information must be present, to get a one-to-one mapping between the object in the CS and the object in the Metaverse.
4. Synchronize information from HR to AD: Once the Metaverse object has a connector to both the HR CS and the AD CS, we can move information from the HR CS to the AD CS. We can, for example, use the employee status information in the HR system to modify the userAccountControl attribute of the AD account. In order to modify the AD CS object, we configure an Outbound Synchronization Rule that will tell the synchronization service how to update the CS object based on the information in the MV object. Synchronizing, however, does not modify the user object in AD; it only modifies the hologram representation of the user in the AD Connector Space.
5. Export information to AD: In order to actually change any information in a Connected Data Source, we need to tell the MA to export the changes. During export, the MA updates the objects in the CDS with the changes it has made to the hologram in the Connector Space.
6. Provision users to the e-mail system: The top CDS could be our e-mail system. As users are not present in this system, we would like the synchronization service to create new objects in the CS for the e-mail system. The process of creating new objects in a Connector Space is called Provisioning.
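FIM itself is configured through its management tools and synchronization rules rather than through code like this, so treat the following Java fragment purely as an illustrative mental model of the vocabulary above: holograms in a Connector Space, aggregated objects in the Metaverse, and the connectors created by projection, joining, and provisioning. Every type and method name here is invented for illustration and is not part of any FIM API.

import java.util.HashMap;
import java.util.Map;

// Illustration only: a toy model of the synchronization concepts, not FIM code.
class MetaverseObject {
    // One aggregated identity, with one connector per Connected Data Source (keyed by MA name).
    Map<String, ConnectorSpaceObject> connectors = new HashMap<String, ConnectorSpaceObject>();
    Map<String, String> attributes = new HashMap<String, String>();
}

class ConnectorSpaceObject {
    // The "hologram": a local copy of the object as it looks in one data source.
    Map<String, String> attributes = new HashMap<String, String>();
}

class ToySyncEngine {
    // Projection: a CS object with no match creates a brand-new MV object.
    MetaverseObject project(String maName, ConnectorSpaceObject cs) {
        MetaverseObject mv = new MetaverseObject();
        mv.connectors.put(maName, cs);
        return mv;
    }

    // Joining: a CS object is matched (for example, on an employee ID) to an existing MV object.
    void join(String maName, ConnectorSpaceObject cs, MetaverseObject mv) {
        mv.connectors.put(maName, cs);
    }

    // Provisioning: the engine creates a new CS object for a target system
    // (for example, the e-mail system) and connects it to the MV object.
    ConnectorSpaceObject provision(String maName, MetaverseObject mv) {
        ConnectorSpaceObject cs = new ConnectorSpaceObject();
        mv.connectors.put(maName, cs);
        return cs;
    }
}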
Projection, Joining, and Provisioning all create a connector between the Metaverse object and the Connector Space object, making it possible to synchronize identity information between different Connected Data Sources. A key concept to understand here is that we do not configure synchronization between Connected Data Sources or between Connector Spaces; we synchronize between each Connector Space and the Metaverse. Looking at the previous example, we can see that when information flows from HR to AD, we configure the following:
The HR MA to import data to the HR CS
Inbound synchronization from the HR CS to the MV
Outbound synchronization from the MV to the AD CS
The AD MA to export the data to AD

Management Agents
Management Agents, or Connectors as some people call them, are the entities that enable FIM to talk to different kinds of data sources. Basically, we can say that FIM can talk to any type of data source, but it only has built-in Management Agents for some. If the data source is really old, we might even have to use the extensibility platform and write our own Management Agent, or buy a Management Agent from a third-party supplier. At http://aka.ms/FIMPartnerMA, we can find a list of Management Agents supplied by Microsoft Partners. For a complete list of Management Agents built in and available from Microsoft, please look at http://aka.ms/FIMMA. With R2, a new Management Agent for Extensible Connectivity 2.0 (ECMA 2.0) was released, introducing new ways of making custom Management Agents. We will see updated versions of most third-party Management Agents as soon as they are migrated to the new ECMA 2.0 platform. Microsoft will also ship new Management Agents using the new ECMA 2.0 platform. Writing our own MA is one way of solving problems communicating with odd data sources, but there might be other solutions to the problem that will require less coding.

Non-declarative vs. declarative synchronization
If you are using FIM Synchronization Service the old way, like we did in MIIS or ILM 2007, it is called non-declarative synchronization. We usually call that classic synchronization and will also use that term in this article. If we use the FIM Service logic to control it all, it is called declarative synchronization. As classic synchronization usually involves writing code and declarative does not, we will also find references calling declarative synchronization codeless. In fact, it was quite possible, in some scenarios, to have codeless synchronization even in the old MIIS or ILM 2007, using classic synchronization. The fact also remains that there are very few FIM 2010 R2 implementations that are indeed code free. In some cases you might even mix the two. This could be due either to migration from MIIS/ILM 2007 to FIM 2010 R2, or to the decision that it is cheaper/quicker/easier to solve a particular problem using classic synchronization.

Password synchronization
This should be the last resort to achieve some kind of Single Sign-On (SSO). Instead of implementing password synchronization, we try to make our customers look at other ways, such as Kerberos or Federation, to get SSO. There are, however, many cases where password synchronization is the best option to maintain passwords in different systems. Not all environments can utilize Kerberos or Federation, and therefore need the FIM password synchronization feature to maintain passwords in different Connected Data Sources.
This feature typically uses either Active Directory (by installing and configuring Password Change Notification Service (PCNS) on the Domain Controllers) or FIM Service as the source for the password change. FIM Synchronization Service then updates the password on the connected object in the Connected Data Sources that are configured as password synchronization targets. In order for FIM to set the password in a target system, the Management Agent used to connect to that specific CDS needs to support this. Most Management Agents available today support password management or can be configured to do so.

FIM Service Management Agent
A very special Management Agent is the one connecting FIM Synchronization Service to FIM Service. Many of the rules we apply to other types of Management Agents do not apply to this one. If you have experience working with classic synchronization in MIIS or ILM 2007, you will find that this Management Agent does not work as the others.

FIM Service
If FIM Synchronization Service is the heart pumping information, FIM Service is the brain (sorry FIM CM, but your brain is not as impressive). FIM Service plays many roles in FIM, and during the design phase the capabilities of FIM Service are often in focus. FIM Service allows you to enforce the Identity Management policy within your organization and also make sure you are compliant at all times. FIM Service has its own database, where it stores the information about the identities it manages.

Request pipeline
In order to make any changes to objects in the FIM Service database, we need to work our way through the FIM Service request pipeline. So, let's look at the following diagram and walk through the request pipeline. Every request is made to the web service interface and follows the ensuing flow:
1. The Request Processor workflow receives the request and evaluates the token (who?) and the request type (what?).
2. Permission is checked to see if the request is allowed.
3. Management Policy Rules are evaluated.
4. If an Authenticate workflow is required, serialize and run the interactive workflow.
5. If an Authorize workflow is required, parallelize and run the asynchronous workflow.
6. Modify the object in the FIM Service database according to the request.
7. If an Action workflow is required, run the follow-up workflows.
As we can see, a request to FIM Service may trigger three types of workflows. With the installation of FIM 2010 R2, we will get a few workflows that will cover many basic requirements, but this is one of the situations where custom coding or third-party workflows might be required in order to fulfill the identity management policy within the organization. Authentication workflow (AuthN) is used when the request requires additional authentication. An example of this is when a user tries to reset his password: the AuthN workflow will ask the anonymous user to authenticate using the QA gate. Authorization workflow (AuthZ) is used when the request requires authorization from someone else. An example of this is when a user is added to a group, but the policy states that the owner of the group needs to approve the request. Action workflow is used for many types of follow-up actions; it could be sending a notification e-mail or modifying attributes, among many other things.

FIM Service Management Agent
FIM Service Management Agent, as we discussed earlier, is responsible for synchronizing data between FIM Service and FIM Synchronization Service.
We said then that this MA is a bit special, and even from the FIM Service perspective it works a little differently. A couple of examples of the special relationship between the FIM Service MA and FIM Service are as follows: Any request made by the FIM Service MA will bypass any AuthN and AuthZ workflows As a performance enhancer, the FIM Service MA is allowed to make changes directly to the FIM Service DB in FIM 2010 R2, without using the request pipeline described earlier   Management Policy Rules (MPRs) The way we control what can be done, or what should happen, is by defining Management Policy Rules (MPRs) within FIM Service. MPR is our tool to enforce the Identity Management policies within our organization. There are two types of MPRs—Request and Set Transition. A Request MPR is used to define how the request pipeline should behave on a particular request. If a request comes in and there is no Request MPR matching the request, it will fail. A Set Transition MPR is used to detect changes in objects and react upon that change. For example, if my EmployeeStatus is changed to Fired, my Active Directory (AD) account should be disabled. A Set is used within FIM Service to group objects. We define rules that govern the criteria for an object to be part of a Set. For example, we can create a Set, which contains all users with Fired as EmployeeStatus. As objects satisfy this criteria and transition in to the Set, we can define a Set Transition MPR to make things such as disabling the AD account happen. We can also define an MPR that applies to the transition out from a Set. The Sets are also used to configure permissions within FIM Service. Using Sets allows us to configure very granular permissions in scenarios where FIM Service is used for user self service.   FIM Portal FIM Portal is usually the starting point for administrators who will configure FIM Service. The configuration of FIM Service is usually done using FIM Portal, but it may also be configured using Power Shell or even your own custom interface. FIM Portal can also be used for self-service scenarios, allowing users to manage some aspect of the Identity Management process. FIM Portal is actually an ASP.NET application using Microsoft Sharepoint as a foundation, and can be modified in many ways.   Self Service Password Reset (SSPR) The Self Service Password Reset (SSPR) feature of FIM is a special case, where most components used to implement it are built-in. The default method is using what is called a QA Gate. FIM 2010 R2 also has built-in methods for using a One Time Password (OTP) that can be sent using either SMS, or e-mail services. In short, the QA Gate works in the following way: The administrator defines a number of questions. Users register for SSPR and provide answers to the questions. Users are presented with the same questions, when a password reset is needed. Giving the correct answers identifies the user and allows them to reset their password.     Once the FIM administrator has used FIM Portal to configure the password reset feature, the end user can register his answers to QA Gate. If the organization has deployed FIM Password Reset Extension to the end user's Windows client, the process of registration and reset can be made directly from the Windows client. If not, the user can register and reset his password using the password registration and reset portals.   FIM Reporting The Reporting component is brand new in FIM 2010 R2. 
In earlier versions of FIM, as well as the older MIIS and ILM, reporting was typically achieved either by buying third-party add-ons or by developing your own solutions based on SQL Reporting Services. The purpose of Reporting is to give you a chance to view historical data. There are a few reports built in to FIM 2010 R2, but many organizations will develop their own reports that comply with their Identity Management policies. The implementation of FIM 2010 R2 will, however, be a little more complex if you want the Reporting component. This is because the engine used to generate the reports is the Data Warehouse component of Microsoft System Center Service Manager (SCSM). There are a number of reasons for using the existing reporting capabilities in SCSM; the main one is that it is easy to extend.

FIM Certificate Management (FIM CM)
Certificate Management is the outcast member of the FIM family. FIM CM can be, and often is, used by itself, without any other parts of FIM being present. It is also the component with the poorest integration with the other components. If we look at it, we will find that it hasn't changed much since its predecessor, Certificate Lifecycle Manager (CLM), was released. FIM CM is mainly focused on managing smart cards, but it can also be used to manage and trace any type of certificate request. The basic concept of FIM CM is that a smart card is requested using the FIM CM portal. Information regarding all requests is stored in the FIM CM database. The Certification Authority, which handles the issuing of the certificates, is configured to report the status back to the FIM CM database. The FIM CM portal also contains a workflow engine, so that the FIM CM admin can configure features such as e-mail notifications as a part of the policies.

Certificate Management portal
FIM Certificate Management uses a portal to interact with users and administrators. The FIM CM portal is an ASP.NET 2.0 website where, for example:
Administrators can configure the policies that govern the processes around certificate management
End users can manage their smart cards for purposes such as renewing and changing PIN codes
Help desks can use the portal to, for example, request temporary smart cards or reset PINs

Licensing
We put this part in here, not to tell you how FIM 2010 R2 is licensed, but rather to tell you that it is complex. Since Microsoft has a habit of changing the way they license their products, we will not put any license details into writing. Depending on what parts you are using and, in some cases, how you are using them, you need to buy different licenses. FIM 2010 R2 (at the time of writing) uses both server licenses and Client Access Licenses (CALs). In almost every FIM project the licensing cost is negligible compared to the gains achieved by implementing it. But even so, please make sure to contact your Microsoft licensing partner, or your Microsoft contact, to clear up any questions you might have around licensing. If you do not have Microsoft System Center Service Manager (SCSM), it is stated (at the time of writing) that you can install and use SCSM for FIM Reporting without having to buy SCSM licenses. Read more about FIM licensing at http://aka.ms/FIMLicense.

Summary
As can be seen, Microsoft Forefront Identity Manager 2010 R2 is not just one product, but a family of products.
In this article, we have given you a short overview of the different components, and we have seen how, together, they can mitigate the challenges that The Company has identified regarding its identity management.

Infinispan Data Grid: Infinispan and JBoss AS 7

Packt
23 Aug 2012
2 min read
The new modular application server JBoss AS has changed a lot with the latest distribution. The new application server has improved in many areas, including a lower memory footprint, lightning-fast startup, true classloading isolation (between built-in modules and modules delivered by developers), and excellent management of resources with the addition of domain controllers. How the Infinispan platform fits into this new picture will be illustrated shortly. Should you need to know all the core details of the AS 7 architecture, you might consider looking for a copy of JBoss AS 7 Configuration, Deployment and Administration, authored by Francesco Marchioni and published in December 2011. In a nutshell, JBoss AS 7 is composed of a set of modules that provide the basic server functionalities. The configuration of modules is no longer spread across a set of single XML files, but is centralized into a single file. Thus, every configuration file holds a single server configuration. A server configuration can, in turn, be based on a set of standalone servers or domain servers. The main difference between standalone servers and domain servers lies in the management area: domain-based servers can be managed from a centralized point (the domain controller), while standalone servers are independent server units, each one managing its own configuration.
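The excerpt stops just before Infinispan itself enters the picture. Purely as a hedged illustration of the embedded API that the data grid builds on (this is not code taken from the article; the class names come from the Infinispan core library, but configuration details vary between versions), a minimal cache usage sketch looks roughly like this:

import org.infinispan.Cache;
import org.infinispan.manager.DefaultCacheManager;

public class GridSample {
    public static void main(String[] args) throws Exception {
        // Boot an embedded cache manager with the default configuration.
        DefaultCacheManager manager = new DefaultCacheManager();
        try {
            // Obtain the default cache and use it like a map.
            Cache<String, String> cache = manager.getCache();
            cache.put("greeting", "Hello Infinispan");
            System.out.println(cache.get("greeting"));
        } finally {
            manager.stop();
        }
    }
}

When Infinispan runs inside JBoss AS 7, the caches are normally defined in the server's infinispan subsystem within that single configuration file, rather than being bootstrapped by hand as in this standalone sketch.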

HBase Administration, Performance Tuning

Packt
21 Aug 2012
8 min read
Setting up Hadoop to spread disk I/O
Modern servers usually have multiple disk devices to provide large storage capacities. These disks are usually configured as RAID arrays, as their factory settings. This is good for many cases but not for Hadoop. The Hadoop slave node stores HDFS data blocks and MapReduce temporary files on its local disks. These local disk operations benefit from using multiple independent disks to spread disk I/O. In this recipe, we will describe how to set up Hadoop to use multiple disks to spread its disk I/O.

Getting ready
We assume you have multiple disks for each DataNode node. These disks are in a JBOD (Just a Bunch Of Disks) or RAID0 configuration. Assume that the disks are mounted at /mnt/d0, /mnt/d1, …, /mnt/dn, and the user who starts HDFS has write permission on each mount point.

How to do it...
In order to set up Hadoop to spread disk I/O, follow these instructions:

1. On each DataNode node, create directories on each disk for HDFS to store its data blocks:

hadoop$ mkdir -p /mnt/d0/dfs/data
hadoop$ mkdir -p /mnt/d1/dfs/data
…
hadoop$ mkdir -p /mnt/dn/dfs/data

2. Add the following code to the HDFS configuration file (hdfs-site.xml):

hadoop@master1$ vi $HADOOP_HOME/conf/hdfs-site.xml

<property>
  <name>dfs.data.dir</name>
  <value>/mnt/d0/dfs/data,/mnt/d1/dfs/data,...,/mnt/dn/dfs/data</value>
</property>

3. Sync the modified hdfs-site.xml file across the cluster:

hadoop@master1$ for slave in `cat $HADOOP_HOME/conf/slaves`
do
  rsync -avz $HADOOP_HOME/conf/ $slave:$HADOOP_HOME/conf/
done

4. Restart HDFS:

hadoop@master1$ $HADOOP_HOME/bin/stop-dfs.sh
hadoop@master1$ $HADOOP_HOME/bin/start-dfs.sh

How it works...
We recommend JBOD or RAID0 for the DataNode disks, because you don't need the redundancy of RAID, as HDFS ensures its data redundancy using replication between nodes. So, there is no data loss when a single disk fails. Which one to choose, JBOD or RAID0? You will theoretically get better performance from a JBOD configuration than from a RAID configuration. This is because, in a RAID configuration, you have to wait for the slowest disk in the array to complete before the entire write operation can complete, which makes the average I/O time equivalent to the slowest disk's I/O time. In a JBOD configuration, operations on a faster disk will complete independently of the slower ones, which makes the average I/O time faster than the slowest one. However, enterprise-class RAID cards might make big differences. You might want to benchmark your JBOD and RAID0 configurations before deciding which one to go with. For both JBOD and RAID0 configurations, you will have the disks mounted at different paths. The key point here is to set the dfs.data.dir property to all the directories created on each disk. The dfs.data.dir property specifies where the DataNode should store its local blocks. By setting it to comma-separated multiple directories, DataNode stores its blocks across all the disks in round robin fashion. This causes Hadoop to efficiently spread disk I/O to all the disks.

Warning
Do not leave blanks between the directory paths in the dfs.data.dir property value, or it won't work as expected.

You will need to sync the changes across the cluster and restart HDFS to apply them.

There's more...
If you run MapReduce, as MapReduce stores its temporary files on the TaskTracker's local file system, you might also like to set up MapReduce to spread its disk I/O:

1. On each TaskTracker node, create directories on each disk for MapReduce to store its intermediate data files:

hadoop$ mkdir -p /mnt/d0/mapred/local
hadoop$ mkdir -p /mnt/d1/mapred/local
…
hadoop$ mkdir -p /mnt/dn/mapred/local

2. Add the following to MapReduce's configuration file (mapred-site.xml):

hadoop@master1$ vi $HADOOP_HOME/conf/mapred-site.xml

<property>
  <name>mapred.local.dir</name>
  <value>/mnt/d0/mapred/local,/mnt/d1/mapred/local,...,/mnt/dn/mapred/local</value>
</property>

3. Sync the modified mapred-site.xml file across the cluster and restart MapReduce.

MapReduce generates a lot of temporary files on the TaskTrackers' local disks during its execution. Like HDFS, setting up multiple directories on different disks helps spread MapReduce disk I/O significantly.

Using network topology script to make Hadoop rack-aware
Hadoop has the concept of "Rack Awareness". Administrators are able to define the rack of each DataNode in the cluster. Making Hadoop rack-aware is extremely important because:
Rack awareness prevents data loss
Rack awareness improves network performance
In this recipe, we will describe how to make Hadoop rack-aware and why it is important.

Getting ready
You will need to know the rack to which each of your slave nodes belongs. Log in to the master node as the user who started Hadoop.

How to do it...
The following steps describe how to make Hadoop rack-aware:

1. Create a topology.sh script and store it under the Hadoop configuration directory. Change the path for topology.data, in line 3, to fit your environment:

hadoop@master1$ vi $HADOOP_HOME/conf/topology.sh

while [ $# -gt 0 ] ; do
  nodeArg=$1
  exec< /usr/local/hadoop/current/conf/topology.data
  result=""
  while read line ; do
    ar=( $line )
    if [ "${ar[0]}" = "$nodeArg" ] ; then
      result="${ar[1]}"
    fi
  done
  shift
  if [ -z "$result" ] ; then
    echo -n "/default/rack "
  else
    echo -n "$result "
  fi
done

Don't forget to set the execute permission on the script file:

hadoop@master1$ chmod +x $HADOOP_HOME/conf/topology.sh

2. Create a topology.data file, as shown in the following snippet; change the IP addresses and racks to fit your environment:

hadoop@master1$ vi $HADOOP_HOME/conf/topology.data

10.161.30.108 /dc1/rack1
10.166.221.198 /dc1/rack2
10.160.19.149 /dc1/rack3

3. Add the following to the Hadoop core configuration file (core-site.xml):

hadoop@master1$ vi $HADOOP_HOME/conf/core-site.xml

<property>
  <name>topology.script.file.name</name>
  <value>/usr/local/hadoop/current/conf/topology.sh</value>
</property>

4. Sync the modified files across the cluster and restart HDFS and MapReduce.

5. Make sure HDFS is now rack-aware. If everything works well, you should be able to find something like the following snippet in your NameNode log file:

2012-03-10 13:43:17,284 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /dc1/rack3/10.160.19.149:50010
2012-03-10 13:43:17,297 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /dc1/rack1/10.161.30.108:50010
2012-03-10 13:43:17,429 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /dc1/rack2/10.166.221.198:50010

6. Make sure MapReduce is now rack-aware.
If everything works well, you should be able to find something like the following snippet in your JobTracker log file: 2012-03-10 13:50:38,341 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /dc1/rack3/ip-10-160-19-149.us-west-1.compute.internal 2012-03-10 13:50:38,485 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /dc1/rack1/ip-10-161-30-108.us-west-1.compute.internal 2012-03-10 13:50:38,569 INFO org.apache.hadoop.net.NetworkTopology: Adding a new node: /dc1/rack2/ip-10-166-221-198.us-west-1.compute.internal How it works... The following diagram shows the concept of Hadoop rack awareness: Each block of the HDFS files will be replicated to multiple DataNodes, to prevent loss of all the data copies due to failure of one machine. However, if all copies of data happen to be replicated on DataNodes in the same rack, and that rack fails, all the data copies will be lost. So to avoid this, the NameNode needs to know the network topology in order to use that information to make intelligent data replication. As shown in the previous diagram, with the default replication factor of three, two data copies will be placed on the machines in the same rack, and another one will be put on a machine in a different rack. This ensures that a single rack failure won't result in the loss of all data copies. Normally, two machines in the same rack have more bandwidth and lower latency between them than two machines in different racks. With the network topology information, Hadoop is able to maximize network performance by reading data from proper DataNodes. If data is available on the local machine, Hadoop will read data from it. If not, Hadoop will try reading data from a machine in the same rack, and if it is available on neither, data will be read from machines in different racks. In step 1, we create a topology.sh script. The script takes DNS names as arguments and returns network topology (rack) names as the output. The mapping of DNS names to network topology is provided by the topology.data file, which was created in step 2. If an entry is not found in the topology.data file, the script returns /default/rack as a default rack name. Note that we use IP addresses, and not hostnames in the topology. data file. There is a known bug that Hadoop does not correctly process hostnames that start with letters "a" to "f". Check HADOOP-6682 for more details. In step 3, we set the topology.script.file.name property in core-site.xml, telling Hadoop to invoke topology.sh to resolve DNS names to network topology names. After restarting Hadoop, as shown in the logs of steps 5 and 6, HDFS and MapReduce add the correct rack name as a prefix to the DNS name of slave nodes. This indicates that the HDFS and MapReduce rack awareness work well with the aforementioned settings.
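As a side note not covered in the recipe above, the script-based mapping is not the only option: Hadoop can also load rack mappings from a Java class that implements the org.apache.hadoop.net.DNSToSwitchMapping interface, configured through the topology.node.switch.mapping.impl property in Hadoop 1.x (later releases rename the property to net.topology.node.switch.mapping.impl and add methods to the interface). The following is a minimal sketch against the Hadoop 1.x interface; the hard-coded addresses simply mirror the sample topology.data above and are placeholders for your own environment:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.hadoop.net.DNSToSwitchMapping;

// Illustrative only: maps a few hard-coded IP addresses to rack names,
// falling back to /default/rack, mirroring what topology.sh does.
public class StaticRackMapping implements DNSToSwitchMapping {

    private static final Map<String, String> RACKS = new HashMap<String, String>();
    static {
        RACKS.put("10.161.30.108", "/dc1/rack1");
        RACKS.put("10.166.221.198", "/dc1/rack2");
        RACKS.put("10.160.19.149", "/dc1/rack3");
    }

    @Override
    public List<String> resolve(List<String> names) {
        List<String> racks = new ArrayList<String>(names.size());
        for (String name : names) {
            String rack = RACKS.get(name);
            racks.add(rack != null ? rack : "/default/rack");
        }
        return racks;
    }
}

The class would need to be on the NameNode's and JobTracker's classpath; for most clusters, the shell-script approach shown in the recipe is simpler to maintain.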

IBM Cognos 10 BI dashboarding components

Packt
16 Jul 2012
7 min read
Introducing IBM Cognos 10 BI Cognos Connection In this recipe we will be exploring Cognos Connection, which is the user interface presented to the user when he/she logs in to IBM Cognos 10 BI for the first time. IBM Cognos 10 BI, once installed and configured, can be accessed through the Web using supported web browsers. For a list of supported web browsers, refer to the Installation and Configuration Guide shipped with the product. Getting ready As stated earlier, make sure that IBM Cognos 10 BI is installed and configured. Install and configure the GO Sales and GO Data Warehouse samples. Use the gateway URI to log on to the web interface called Cognos Connection. How to do it... To explore Cognos Connection, perform the following steps: Log on to Cognos Connection using the gateway URI that may be similar to http://<HostName>:<PortNumber>/ibmcognos/cgi-bin/cognos.cgi. Take note of the Cognos Connection interface. It has the GO Sales and GO Data Warehouse samples visible. Note the blue-colored folder icon, shown as in the preceding screenshot. It represents metadata model packages that are published to Cognos Connection using the Cognos Framework Manager tool. These packages have objects that represent business data objects, relationships, and calculations, which can be used to author reports and dashboards. Refer to the book, IBM Cognos TM1 Cookbook by Packt Publishing to learn how to create metadata models packages. From the toolbar, click on Launch. This will open a menu, showing different studios, each having different functionality, as shown in the following screenshot: We will use Business Insight and Business Insight Advanced, which are the first two choices in the preceding menu. These are the two components used to create and view dashboards. For other options, refer to the corresponding books by the same publisher. For instance, refer to the book, IBM Cognos 8 Report Studio Cookbook to know more about creating and distributing complex reports. Query Studio and Analysis Studio are meant to provide business users with the facility to slice and dice business data themselves. Event Studio is meant to define business situations and corresponding actions. Coming back to Cognos Connection, note that a yellow-colored folder icon, which is shown as represents a user-defined folder, which may or may not contain other published metadata model packages, reports, dashboards, and other content. In our case, we have a user-defined folder called Samples. This was created when we installed and configured samples shipped with the product. Click on the New Folder icon, which is represented by , on the toolbar to create a user-defined folder. Other options are also visible here, for instance to create a new dashboard.   Click on the user-defined folder—Samples to view its contents, as shown in the following screenshot: As shown in the preceding screenshot, it has more such folders, each having its own content. The top part of the pane shows the navigation path. Let's navigate deeper into Models | Business Insight Samples to show some sample dashboards, created using IBM Cognos Business Insight, as shown in the following screenshot: Click on one of these links to view the corresponding dashboard. For instance, click on Sales Dashboard (Interactive) to view the dashboard, as shown in the following screenshot: The dashboard can also be opened in the authoring tool, which is IBM Cognos Business Insight, in this case by clicking on the icon shown as on extreme right, on Cognos Connection. 
It will show the same result as shown in the preceding screenshot. We will see the Business Insight interface in detail later in this article.

How it works...
Cognos Connection is the primary user interface that the user sees when he/she logs in for the first time. Business data has to be first identified and imported from the metadata model using the Cognos Framework Manager tool. Relationships (inner/outer joins) and calculations are then created, and the resultant metadata model package is published to the IBM Cognos 10 BI Server. This becomes available on Cognos Connection. Users are given access to appropriate studios on Cognos Connection, according to their needs. Analysis, reports, and dashboards are then created and distributed using one of these studios. The preceding sample has used Business Insight, for instance. Later sections in this article will look more into Business Insight and Business Insight Advanced. The next section focuses on the Business Insight interface details from the navigation perspective.

Exploring IBM Cognos Business Insight User Interface
In this recipe we will explore the IBM Cognos Business Insight User Interface in more detail. We will explore various areas of the UI, each dedicated to performing different actions.

Getting ready
As stated earlier, we will be exploring different sections of Cognos Business Insight. Hence, make sure that the IBM Cognos 10 BI installation is open and the samples are set up properly. We will start the recipe assuming that the IBM Cognos Connection window is already open on the screen.

How to do it...
To explore the IBM Cognos Business Insight User Interface, perform the following steps:
In the IBM Cognos Connection window, navigate to Business Insight Samples, as shown in the following screenshot.
Click on one of the dashboards, for instance Marketing Dashboard, to open the dashboard in Business Insight. Different areas are labeled, as shown in the following figure.
The overall layout is termed the Dashboard. The topmost toolbar is called the Application bar. The Application bar contains different icons to manage the dashboard as a whole. For instance, we can create, open, e-mail, share, or save the dashboard using one of the icons on the Application bar. The user can explore different icons on the Application bar by hovering the mouse pointer over them. Hovering displays the tooltip, which has a brief but self-explanatory help text. Similarly, there is a Widget toolbar for every widget, which gets activated when the user clicks on the corresponding widget. When the mouse is focused away from the widget, the Widget toolbar disappears. It has various options, for instance to refresh the widget data, print as PDF, resize to fit content, and so on. It also provides the user with the capability to change the chart type as well as to change the color palette. However, all these options have help text associated with them, which is activated on mouse hover. The Content tab and Content pane show the list of objects available on Cognos Connection. The directory structure on Cognos Connection can be navigated using the Content pane and Content tab, and hence, available objects can be added to or removed from the dashboard. The drag-and-drop functionality has been provided, as a result of which creating and editing a dashboard has become as simple as moving objects between the Dashboard area and Cognos Connection. The Toolbox tab displays additional widgets. The Slider Filter and Select Value Filter widgets allow the user to filter report content.
The other toolbox widgets allow user to add more report content to the dashboard, such as HTML content, images, RSS feeds, and rich text. How it works... In the preceding section, we have seen basic areas of Business Insight. More than one user can log on to the IBM Cognos 10 BI server, and create various objects on Cognos Connection. These objects include packages, reports, cubes, templates, and statistics to name a few. These objects can be created using one or more tools available to users. For instance, reports can be created using one of the studios available. Cubes can be created using IBM Cognos TM1 or IBM Cognos Transformer and published on Cognos Connection. Metadata model packages can be created using IBM Cognos Framework Manager and published on Cognos Connection. These objects can then be dragged, dropped, and formatted as standalone objects in Cognos Business Insight, and hence, dashboards can be created.

article-image-article-ibm-cognos-10-bi-dashboard-business-insight
Packt
16 Jul 2012
7 min read
Save for later

IBM Cognos 10 Business Intelligencea

Packt
16 Jul 2012
7 min read
Introducing IBM Cognos 10 BI Cognos Connection In this recipe we will be exploring Cognos Connection, which is the user interface presented to the user when he/she logs in to IBM Cognos 10 BI for the first time. IBM Cognos 10 BI, once installed and configured, can be accessed through the Web using supported web browsers. For a list of supported web browsers, refer to the Installation and Configuration Guide shipped with the product. Getting ready As stated earlier, make sure that IBM Cognos 10 BI is installed and configured. Install and configure the GO Sales and GO Data Warehouse samples. Use the gateway URI to log on to the web interface called Cognos Connection. How to do it... To explore Cognos Connection, perform the following steps: Log on to Cognos Connection using the gateway URI that may be similar to http://<HostName>:<PortNumber>/ibmcognos/cgi-bin/cognos.cgi. Take note of the Cognos Connection interface. It has the GO Sales and GO Data Warehouse samples visible. Note the blue-colored folder icon, shown as in the preceding screenshot. It represents metadata model packages that are published to Cognos Connection using the Cognos Framework Manager tool. These packages have objects that represent business data objects, relationships, and calculations, which can be used to author reports and dashboards. Refer to the book, IBM Cognos TM1 Cookbook by Packt Publishing to learn how to create metadata models packages. From the toolbar, click on Launch. This will open a menu, showing different studios, each having different functionality, as shown in the following screenshot: We will use Business Insight and Business Insight Advanced, which are the first two choices in the preceding menu. These are the two components used to create and view dashboards. For other options, refer to the corresponding books by the same publisher. For instance, refer to the book, IBM Cognos 8 Report Studio Cookbook to know more about creating and distributing complex reports. Query Studio and Analysis Studio are meant to provide business users with the facility to slice and dice business data themselves. Event Studio is meant to define business situations and corresponding actions. Coming back to Cognos Connection, note that a yellow-colored folder icon, which is shown as represents a user-defined folder, which may or may not contain other published metadata model packages, reports, dashboards, and other content. In our case, we have a user-defined folder called Samples. This was created when we installed and configured samples shipped with the product. Click on the New Folder icon, which is represented by , on the toolbar to create a user-defined folder. Other options are also visible here, for instance to create a new dashboard.   Click on the user-defined folder—Samples to view its contents, as shown in the following screenshot: As shown in the preceding screenshot, it has more such folders, each having its own content. The top part of the pane shows the navigation path. Let's navigate deeper into Models | Business Insight Samples to show some sample dashboards, created using IBM Cognos Business Insight, as shown in the following screenshot: Click on one of these links to view the corresponding dashboard. For instance, click on Sales Dashboard (Interactive) to view the dashboard, as shown in the following screenshot: The dashboard can also be opened in the authoring tool, which is IBM Cognos Business Insight, in this case by clicking on the icon shown as on extreme right, on Cognos Connection. 
It will show the same result as shown in the preceding screenshot. We will see the Business Insight interface in detail later in this article. How it works... Cognos Connection is the primary user interface that user sees when he/she logs in for the first time. Business data has to be first identified and imported from the metadata model using the Cognos Framework Manager tool. Relationships (inner/outer joins) and calculations are then created, and the resultant metadata model package is published to the IBM Cognos 10 BI Server. This becomes available on Cognos Connection. Users are given access to appropriate studios on Cognos Connection, according to their needs. Analysis, reports, and dashboards are then created and distributed using one of these studios. The preceding sample has used Business Insight, for instance. Later sections in this article will look more into Business Insight and Business Insight Advanced. The next section focuses on the Business Insight interface details from the navigation perspective. Exploring IBM Cognos Business Insight User Interface In this recipe we will explore IBM Cognos Business Insight User Interface in more detail. We will explore various areas of the UI, each dedicated to perform different actions. Getting ready As stated earlier, we will be exploring different sections of Cognos Business Insight. Hence, make sure that IBM Cognos 10 BI installation is open and samples are set up properly. We will start the recipe assuming that the IBM Cognos Connection window is already open on the screen. How to do it... To explore IBM Cognos Business Insight User Interface, perform the following steps: In the IBM Cognos Connection window, navigate to Business Insight Samples, as shown in the following screenshot: Click on one of the dashboards, for instance Marketing Dashboard to open the dashboard in Business Insight. Different areas are labeled, as shown in the following figure: The overall layout is termed as Dashboard. The topmost toolbar is called Application bar . The Application bar contains different icons to manage the dashboard as a whole. For instance, we can create, open, e-mail, share, or save the dashboard using one of the icons on the Application bar. The user can explore different icons on the Application bar by hovering the mouse pointer over them. Hovering displays the tooltip, which has a brief but self-explanatory help text. Similarly, it has a Widget toolbar for every widget, which gets activated when the user clicks on the corresponding widget. When the mouse is focused away from the widget, the Widget toolbar disappears. It has various options, for instance to refresh the widget data, print as PDF, resize to ? t content, and so on. It also provides the user with the capability to change the chart type as well as to change the color palette. However, all these options have help text associated with them, which is activated on mouse hover. Content tab and Content pane show the list of objects available on the Cognos Connection. Directory structure on Cognos Connection can be navigated using Content pane and Content tab, and hence, available objects can be added to or removed from the dashboard. The drag-and-drop functionality has been provided as a result of which creating and editing a dashboard has become as simple as moving objects between the Dashboard area and Cognos Connection. The Toolbox tab displays additional widgets. The Slider Filter and Select Value Filter widgets allow the user to filter report content. 
The other toolbox widgets allow user to add more report content to the dashboard, such as HTML content, images, RSS feeds, and rich text. How it works... In the preceding section, we have seen basic areas of Business Insight. More than one user can log on to the IBM Cognos 10 BI server, and create various objects on Cognos Connection. These objects include packages, reports, cubes, templates, and statistics to name a few. These objects can be created using one or more tools available to users. For instance, reports can be created using one of the studios available. Cubes can be created using IBM Cognos TM1 or IBM Cognos Transformer and published on Cognos Connection. Metadata model packages can be created using IBM Cognos Framework Manager and published on Cognos Connection. These objects can then be dragged, dropped, and formatted as standalone objects in Cognos Business Insight, and hence, dashboards can be created.
Read more
  • 0
  • 0
  • 895
article-image-article-ibm-cognos-business-intelligence-dashboard-business-insight-advanced
Packt
16 Jul 2012
4 min read
Save for later

IBM Cognos 10 Business Intelligence

Packt
16 Jul 2012
4 min read
Introducing IBM Cognos 10 BI Cognos Connection In this recipe we will be exploring Cognos Connection, which is the user interface presented to the user when he/she logs in to IBM Cognos 10 BI for the first time. IBM Cognos 10 BI, once installed and configured, can be accessed through the Web using supported web browsers. For a list of supported web browsers, refer to the Installation and Configuration Guide shipped with the product. Getting ready As stated earlier, make sure that IBM Cognos 10 BI is installed and configured. Install and configure the GO Sales and GO Data Warehouse samples. Use the gateway URI to log on to the web interface called Cognos Connection. How to do it... To explore Cognos Connection, perform the following steps: Log on to Cognos Connection using the gateway URI that may be similar to http://<HostName>:<PortNumber>/ibmcognos/cgi-bin/cognos.cgi. Take note of the Cognos Connection interface. It has the GO Sales and GO Data Warehouse samples visible. Note the blue-colored folder icon, shown as in the preceding screenshot. It represents metadata model packages that are published to Cognos Connection using the Cognos Framework Manager tool. These packages have objects that represent business data objects, relationships, and calculations, which can be used to author reports and dashboards. Refer to the book, IBM Cognos TM1 Cookbook by Packt Publishing to learn how to create metadata models packages. From the toolbar, click on Launch. This will open a menu, showing different studios, each having different functionality, as shown in the following screenshot: We will use Business Insight and Business Insight Advanced, which are the first two choices in the preceding menu. These are the two components used to create and view dashboards. For other options, refer to the corresponding books by the same publisher. For instance, refer to the book, IBM Cognos 8 Report Studio Cookbook to know more about creating and distributing complex reports. Query Studio and Analysis Studio are meant to provide business users with the facility to slice and dice business data themselves. Event Studio is meant to define business situations and corresponding actions. Coming back to Cognos Connection, note that a yellow-colored folder icon, which is shown as represents a user-defined folder, which may or may not contain other published metadata model packages, reports, dashboards, and other content. In our case, we have a user-defined folder called Samples. This was created when we installed and configured samples shipped with the product. Click on the New Folder icon, which is represented by , on the toolbar to create a user-defined folder. Other options are also visible here, for instance to create a new dashboard.   Click on the user-defined folder—Samples to view its contents, as shown in the following screenshot: As shown in the preceding screenshot, it has more such folders, each having its own content. The top part of the pane shows the navigation path. Let's navigate deeper into Models | Business Insight Samples to show some sample dashboards, created using IBM Cognos Business Insight, as shown in the following screenshot: Click on one of these links to view the corresponding dashboard. For instance, click on Sales Dashboard (Interactive) to view the dashboard, as shown in the following screenshot: The dashboard can also be opened in the authoring tool, which is IBM Cognos Business Insight, in this case by clicking on the icon shown as on extreme right, on Cognos Connection. 
It will show the same result as shown in the preceding screenshot. We will see the Business Insight interface in detail later in this article.

How it works...

Cognos Connection is the primary user interface that users see when they log in for the first time. Business data first has to be identified and imported into a metadata model using the Cognos Framework Manager tool. Relationships (inner/outer joins) and calculations are then created, and the resultant metadata model package is published to the IBM Cognos 10 BI server, where it becomes available on Cognos Connection. Users are given access to the appropriate studios on Cognos Connection, according to their needs. Analyses, reports, and dashboards are then created and distributed using one of these studios; the preceding sample used Business Insight, for instance. Later sections in this article will look more closely at Business Insight and Business Insight Advanced. The next section focuses on the Business Insight interface from a navigation perspective.

article-image-android-database-programming-binding-ui
Packt
07 Jun 2012
6 min read
Save for later

Android Database Programming: Binding to the UI

(For more resources on Android, see here.)

SimpleCursorAdapters and ListViews

There are two major ways of retrieving data on Android, and each has its own class of ListAdapters, which will then know how to handle and bind the passed-in data. The first way of retrieving data is one that we're very familiar with already – through making queries and obtaining Cursor objects. The subclass of ListAdapters that wraps around Cursors is called CursorAdapter, and in the next section we'll focus on the SimpleCursorAdapter, which is the most straightforward instance of CursorAdapter.

The Cursor points to a subtable of rows containing the results of our query. By iterating through this cursor, we are able to examine the fields of each row. Now we would like to convert each row of the subtable into a corresponding row in our list. The first step in doing this is to set up a ListActivity (a variant of the more common Activity class). As its name suggests, a ListActivity is simply a subclass of the Activity class that comes with methods that allow you to attach ListAdapters. The ListActivity class also allows you to inflate XML layouts that contain list tags. In our example, we will use a very bare-bones XML layout (named list.xml) that only contains a ListView tag, as follows:

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout 
    android:orientation="vertical"
    android:layout_width="fill_parent"
    android:layout_height="wrap_content" >
    <ListView
        android:id="@android:id/list"
        android:layout_width="fill_parent"
        android:layout_height="wrap_content" />
</LinearLayout>

This is the first step in setting up what's called a ListView in Android. Similar to how defining a TextView allows you to see a block of text in your Activity, defining a ListView will allow you to interact with a scrollable list of row objects in your Activity. Intuitively, the next question in your mind should be: Where do I define how each row actually looks? Not only do you need to define the actual list object somewhere, but each row should have its own layout as well. So, to do this, we create a separate list_entry.xml file in our layouts directory.

The example I'm about to use is the one that queries the Contacts content provider and returns a list containing each contact's name, phone number, and phone number type. Thus, each row of my list should contain three TextViews, one for each data field. Subsequently, my list_entry.xml file looks like the following:

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout 
    android:orientation="vertical"
    android:layout_width="fill_parent"
    android:layout_height="wrap_content"
    android:padding="10dip" >
    <TextView
        android:id="@+id/name_entry"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:textSize="28dip" />
    <TextView
        android:id="@+id/number_entry"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:textSize="16dip" />
    <TextView
        android:id="@+id/number_type_entry"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:textColor="#DDD"
        android:textSize="14dip" />
</LinearLayout>

So we have a vertical LinearLayout that contains three TextViews, each with its own properly defined ID as well as its own aesthetic properties (that is, text size and text color). In terms of setup, this is all we need! Now we just need to create the ListActivity itself, inflate the list.xml layout, and specify the adapter.
To see how all this is done, let's take a look at the code before breaking it apart piece by piece:

public class SimpleContactsActivity extends ListActivity {

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.list);

        // MAKE QUERY TO CONTACT CONTENTPROVIDER
        String[] projections = new String[] { Phone._ID, Phone.DISPLAY_NAME,
                Phone.NUMBER, Phone.TYPE };
        Cursor c = getContentResolver().query(Phone.CONTENT_URI, projections,
                null, null, null);
        startManagingCursor(c);

        // THE DESIRED COLUMNS TO BE BOUND
        String[] columns = new String[] { Phone.DISPLAY_NAME, Phone.NUMBER,
                Phone.TYPE };

        // THE XML DEFINED VIEWS FOR EACH FIELD TO BE BOUND TO
        int[] to = new int[] { R.id.name_entry, R.id.number_entry,
                R.id.number_type_entry };

        // CREATE ADAPTER WITH CURSOR POINTING TO DESIRED DATA
        SimpleCursorAdapter cAdapter = new SimpleCursorAdapter(this,
                R.layout.list_entry, c, columns, to);

        // SET THIS ADAPTER AS YOUR LIST ACTIVITY'S ADAPTER
        this.setListAdapter(cAdapter);
    }
}

So what's going on here? Well, the first part of the code you should recognize by now – we're simply making a query over the phone's contact list (specifically, over the Contacts content provider's Phone table) and asking for the contact's name, number, and number type.

Next, the SimpleCursorAdapter takes as two of its parameters a string array and an integer array, which represent a mapping between Cursor columns and XML layout views. In our case, this is as follows:

// THE DESIRED COLUMNS TO BE BOUND
String[] columns = new String[] { Phone.DISPLAY_NAME, Phone.NUMBER, Phone.TYPE };

// THE XML DEFINED VIEWS FOR EACH FIELD TO BE BOUND TO
int[] to = new int[] { R.id.name_entry, R.id.number_entry, R.id.number_type_entry };

This is so that the data in the DISPLAY_NAME column will get bound to the TextView with ID name_entry, and so on. Once we have these mappings defined, the next part is to just instantiate the SimpleCursorAdapter, which can be seen in this line:

// CREATE ADAPTER WITH CURSOR POINTING TO DESIRED DATA
SimpleCursorAdapter cAdapter = new SimpleCursorAdapter(this, R.layout.list_entry, c, columns, to);

Now, the SimpleCursorAdapter takes five parameters. The first is the Context, which essentially tells the CursorAdapter which parent Activity it needs to inflate and bind the rows to. The next parameter is the ID of the R layout that you defined earlier, which tells the CursorAdapter what each row should look like and, furthermore, where it can inflate the corresponding Views. Next, we pass in the Cursor, which tells the adapter what the underlying data actually is, and lastly, we pass in the mappings.

Hopefully, the previous code makes sense, and the parameters of SimpleCursorAdapter make sense as well. The result of the previous Activity can be seen in the following screenshot:

Everything looks good, except for the random integers floating around under the phone numbers. Why are there a bunch of 1s, 2s, and 3s at the bottom of each row where the types should be? Well, the phone number types are not returned as Strings but are instead returned as integers. From there, through a simple switch statement, we can easily convert these integers into more descriptive Strings. However, you'll quickly see that with our very simple, straightforward use of the built-in SimpleCursorAdapter class, there was nowhere for us to implement any "special" logic that would allow us to convert such returned integers to Strings.
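For reference, the "simple switch statement" mentioned above could look roughly like the following sketch. The TYPE_* constants are the standard ones from ContactsContract.CommonDataKinds.Phone (imported as Phone, as in the Activity above), but the method name typeToString is our own invention, not part of the Android API:

// Hypothetical helper: maps the Phone.TYPE integer to display text
private String typeToString(int type) {
    switch (type) {
        case Phone.TYPE_HOME:   // 1
            return "Home";
        case Phone.TYPE_MOBILE: // 2
            return "Mobile";
        case Phone.TYPE_WORK:   // 3
            return "Work";
        default:
            return "Other";
    }
}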
This is when overriding the SimpleCursorAdapter class becomes necessary, because only then do we have full control over how the Cursor's data is displayed in each row. And so, we move on to the next section, where we see just that.
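As a preview of that approach, here is a minimal sketch (not the book's own listing) of what such a subclass might look like. The class name ContactsCursorAdapter is ours, it assumes the same list_entry.xml row layout and projection used above, and it uses the standard Phone.getTypeLabel() helper instead of a hand-rolled switch:

import android.content.Context;
import android.database.Cursor;
import android.provider.ContactsContract.CommonDataKinds.Phone;
import android.view.View;
import android.widget.SimpleCursorAdapter;
import android.widget.TextView;

public class ContactsCursorAdapter extends SimpleCursorAdapter {

    public ContactsCursorAdapter(Context context, int layout, Cursor c,
            String[] from, int[] to) {
        super(context, layout, c, from, to);
    }

    @Override
    public void bindView(View view, Context context, Cursor cursor) {
        // Let SimpleCursorAdapter bind the name, number, and raw type integer first
        super.bindView(view, context, cursor);

        // Then overwrite the type field with a human-readable label
        int type = cursor.getInt(cursor.getColumnIndex(Phone.TYPE));
        TextView typeView = (TextView) view.findViewById(R.id.number_type_entry);
        // Pass "" as the custom label; add Phone.LABEL to the projection
        // if you also want to display user-defined (custom) types
        typeView.setText(Phone.getTypeLabel(context.getResources(), type, ""));
    }
}

Swapping the earlier new SimpleCursorAdapter(...) call for new ContactsCursorAdapter(this, R.layout.list_entry, c, columns, to) should be all the wiring that is needed; the next section covers this in detail.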