Debugging mechanisms in Oracle SQL Developer 2.1 using Pl/SQL

Packt
08 Sep 2010
7 min read
Once your PL/SQL code has successfully compiled, it is important to review it to make sure it does what is required and that it performs well. You can consider a number of approaches when tuning and testing code, including:

- Debugging: run the code and add breakpoints to stop and inspect areas of concern.
- SQL performance: use Explain Plan results to review the performance.
- PL/SQL performance: use the PL/SQL Hierarchical Profiler to identify bottlenecks.
- Unit testing: review edge cases and general function testing. Does the code do what you intended it to do?

In this article by Sue Harper, author of Oracle SQL Developer 2.1, we'll review the debugger. We will see how to debug PL/SQL packages, procedures, and functions.

Debugging PL/SQL code

SQL and PL/SQL code may compile cleanly and even produce results, but that is only part of the task. Does it do what you are expecting it to do? Are the results accurate? Does it behave as expected for high and low values, odd dates, or names? Does it behave the same way when it's called from within a program as it does when tested in isolation? Does it perform as well for massive sets of data as it does for a small test case? All of these are aspects to consider when testing code, and many can be tracked by debugging the code.

Using the debugging mechanism in SQL Developer

You will need a piece of compiled, working code.
For this exercise, we will use the following piece of code:

PROCEDURE EMP_DEPTS (P_MAXROWS VARCHAR2)
AS
  CURSOR EMPDEPT_CURSOR IS
    SELECT D.DEPARTMENT_NAME, E.LAST_NAME, J.JOB_TITLE
    FROM DEPARTMENTS D, EMPLOYEES E, JOBS J
    WHERE D.DEPARTMENT_ID = E.DEPARTMENT_ID
      AND E.JOB_ID = J.JOB_ID;
  EMP_RECORD EMPDEPT_CURSOR%ROWTYPE;
  TYPE EMP_TAB_TYPE IS TABLE OF EMPDEPT_CURSOR%ROWTYPE
    INDEX BY BINARY_INTEGER;
  EMP_TAB EMP_TAB_TYPE;
  I NUMBER := 1;
BEGIN
  OPEN EMPDEPT_CURSOR;
  FETCH EMPDEPT_CURSOR INTO EMP_RECORD;
  EMP_TAB(I) := EMP_RECORD;
  WHILE ((EMPDEPT_CURSOR%FOUND) AND (I <= P_MAXROWS))
  LOOP
    I := I + 1;
    FETCH EMPDEPT_CURSOR INTO EMP_RECORD;
    EMP_TAB(I) := EMP_RECORD;
  END LOOP;
  CLOSE EMPDEPT_CURSOR;
  FOR J IN REVERSE 1 .. I
  LOOP
    DBMS_OUTPUT.PUT_LINE('THE EMPLOYEE ' || EMP_TAB(J).LAST_NAME
      || ' WORKS IN DEPARTMENT ' || EMP_TAB(J).DEPARTMENT_NAME);
  END LOOP;
END;

Before you can debug code, you need the following privileges:

- EXECUTE and DEBUG: you need to be able to execute the required procedure.
- DEBUG CONNECT SESSION: to be able to debug procedures you execute in the session.

Note that when granting the system privilege DEBUG ANY PROCEDURE, you are granting access to debug any procedure that you have execute privilege for and that has been compiled for debug.

Using the Oracle debugging packages

Oracle provides two packages for debugging PL/SQL code. The first, DBMS_DEBUG, was introduced in Oracle 8i and is not used by newer IDEs. The second, DBMS_DEBUG_JDWP, was introduced in Oracle 9i Release 2, and is used in SQL Developer when debugging sub-programs.

Debugging

When preparing to debug any code, you need to set at least one breakpoint, and then you should select Compile for Debug. In the following screenshot, the breakpoint is set at the opening of the cursor, and the Compile for Debug option is shown in the drop-down list. Instead of using the drop-down list to select the Compile or Compile for Debug options, you can just click the Compile button.
This compiles the PL/SQL code using the optimization level set in the Preferences (select Database | PL/SQL Compiler). By setting the Optimization Level preference to 0 or 1, the PL/SQL is compiled with debugging information. Any PL/SQL code that has been compiled for debugging will show a little green bug overlaying the regular icon in the Connections navigator. The next screenshot shows that the EMP_DEPTS procedure and the GET_SALARY function have both been compiled for debug.

Compile for debug

Once you have completed a debugging session, be sure to compile again afterwards to remove the debug compiler directives. Although the impact is usually negligible, leaving them in place can affect the performance of the PL/SQL program.

You are now ready to debug. To debug, click the Debug button in the toolbar. SQL Developer then switches the session to a debug session, issues the command DBMS_DEBUG_JDWP.CONNECT_TCP(hostname, port), and sets up the debug windows as shown in the following screenshot. This connects you to a debugger session in the database.

In some instances, the port selected is not open, due to firewall or other restrictions. In this case, you can have SQL Developer prompt you for the port. To set this option, open the Preferences dialog and select the Debugger node. You can also specify the port range available for SQL Developer to use. These options give you more control over the ports used.

Navigating through the code

The PL/SQL debugger provides a selection of buttons (or menu items) to step through individual lines of code, or to step over blocks of code. You can step through or over procedures, navigating to the point of contention or the area you wish to inspect. Once you start stepping into the code, you can track the data as it changes. The data is displayed in a second set of tabbed dialogs. In this example, we are looping through a set of records so that you can see how each of the windows behaves.
As you start stepping into the code, the Data tab starts to display the values. This Data tab continues to collect all of the variables as you continue to step through the code. Even if you step over and skip blocks of code, all of the code is executed and the results are gathered here. The Smart Data tab keeps track of the same detail, but only the values immediately related to the area you are working in. This is more useful in a large procedure than in a small one like the example shown.

The context menu provides you with a set of options while debugging. These include:

- Run to Cursor: allows you to start debugging and then quickly move to another part of the code. The code in between is quickly executed and you can continue debugging.
- Watch: allows you to watch an expression or code while you are debugging.
- Inspect: allows you to watch values as you debug.

In the following screenshot, the current execution point is at the start of the WHILE loop. If the loop is required to loop multiple times, you can skip that and have the code execute to a point further down in the code, in this case after the cursor has been completed and closed.

The Watch and Inspect options remain set up if you stop and restart the debug session. This allows you to stop, change the input values, and start debugging again, and these will change according to the new parameters. You do not need to set up watch or inspector values each time you debug the procedure. The values appear in dockable windows, so you can dock or float them near the code as required.

You can modify values that you are watching. In the following example, 'i' is the counter that we're using in the loop. You can modify this value to skip over chunks of the loop, and then continue from a particular point. Modifying values in the middle of the code can be useful, as you might want to test how the program reacts in certain circumstances.
For example, before the millennium, testers may have wanted to see how code behaved, or how output changed, once the date switched over to the year 2000.

Examples of MySQL Daemon Plugin

Packt
01 Sep 2010
8 min read
(For more resources on MySQL, see here.)

A Hello World! Daemon plugin

Now, let's look at our first complete plugin example. This plugin is probably the most basic plugin we can have. It simply prints a message into the MySQL error log when loaded:

#include <stdio.h>
#include <mysql/plugin.h>
#include <mysql_version.h>

These are the basic includes required for most Daemon plugins, the most important being mysql/plugin.h, which contains macros and data structures necessary for a MySQL plugin.

static int hello_world_plugin_init(void *p)
{
  fprintf(stderr, "Hello World: "
          "This is a static text daemon example plugin!\n");
  return 0;
}

In the plugin initialization function we simply write a message to stderr. MySQL redirects stderr to the error log (if there is one), so our message will end up there. We then return 0 to indicate that the initialization was successful.

struct st_mysql_daemon hello_world_info =
{ MYSQL_DAEMON_INTERFACE_VERSION };

This structure is used for the info part of the plugin declaration. In Daemon plugins it simply contains the API version that this plugin was compiled against. The Daemon plugin API version matches the MySQL server version, which means MySQL Daemon plugins can only be used with the MySQL server version they have been compiled against. Indeed, for a Daemon plugin to do something non-trivial it will invariably need access to the server's internal functions and data structures, which change with every MySQL version. Other plugin types are implemented against a specific functionality API, which separates them from the server internals and keeps them binary compatible with a wide range of server releases.

Having defined all of the functions and auxiliary structures, we can declare a plugin:

mysql_declare_plugin(hello_world)
{

This is a Daemon plugin, so we need to specify it as such with this defined constant:

  MYSQL_DAEMON_PLUGIN,

info points to the structure declared earlier.
With other plugin types this may contain additional information valuable to the plugin functionality:

  &hello_world_info,

We are calling this plugin "hello_world". This is its name for the INSTALL PLUGIN command and any plugin status:

  "hello_world",

The author string is useful for providing contact information about the author of the plugin:

  "Andrew Hutchings ([email protected])",

A simple line of text gives a basic description of what our plugin does:

  "Daemon hello world example, outputs some static text",

This plugin is licensed under the GPL, so we set the license type accordingly:

  PLUGIN_LICENSE_GPL,

This is our initialization function, defined earlier in the code:

  hello_world_plugin_init,

As our simple plugin does not need a de-initialization function, we put NULL here:

  NULL,

This plugin is given version 1.0 because it is our first GA release of the plugin. In future versions we can increment this:

  0x0100,

There are no status or system variables in this example, so everything below the version is set to NULL:

  NULL,
  NULL,
  NULL
}
mysql_declare_plugin_end;

We can now install this plugin using the INSTALL PLUGIN syntax:

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.47 Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> INSTALL PLUGIN hello_world SONAME 'hello_world.so';
Query OK, 0 rows affected (0.00 sec)

Going to the error log we see:

090801 22:18:00 [Note] /home/linuxjedi/Programming/Builds/mysql-5.1.47/libexec/mysqld: ready for connections.
Version: '5.1.47'  socket: '/tmp/mysql.sock'  port: 3306  Source distribution
Hello World: This is a static text daemon example plugin!

A system and status variables demo plugin

Let's see a more complex example. This plugin shows how to create system and status variables. It has one global system variable and one status variable, both defined as long long.
When you set the global system variable, its value is copied into the status variable.

#include <stdio.h>
#include <mysql/plugin.h>
#include <mysql_version.h>

long long system_var = 0;
long long status_var = 0;

struct st_mysql_show_var vars_status_var[] =
{
  {"vars_status_var", (char *) &status_var, SHOW_LONGLONG},
  {0, 0, 0}
};

We have one status variable in this plugin, called vars_status_var, which is bound to the status_var variable defined near the top of this source code. We are defining this variable as long long, so we use the SHOW_LONGLONG type.

int sysvar_check(MYSQL_THD thd, struct st_mysql_sys_var *var,
                 void *save, struct st_mysql_value *value)
{

This function is called before our system variable is updated. A plugin is not required to provide it, but it can be used to check whether the data entered is valid; as an example, we will only allow values that are not too close to status_var.

  long long buf;
  value->val_int(value, &buf);

First we retrieve the new value-to-be and store it in buf.

  *(longlong*) save = buf;

We then set save to the contents of buf, so that the update function can access it and store the value in our system_var variable. If we do not implement our own sysvar_check() function for our system variable, MySQL will provide a default one that performs all of the above (but nothing of the following).

  if (buf * 2 < status_var || buf > status_var * 3)
    return 0;
  else
    return 1;
}

This is our special condition. In this example we allow an update only if the new value is either less than half of, or more than three times, the value of status_var. We return 0 when the new value is valid and the update should be allowed, and 1 when the update should be canceled.

In our update function we copy the value of system_var to status_var, to see how its value changes in SHOW STATUS and to get a different range of valid values for system_var on every update. Note that the update function cannot return a value. It is not supposed to fail!
void sysvar_update(MYSQL_THD thd, struct st_mysql_sys_var *var,
                   void *var_ptr, const void *save)
{
  system_var = *(long long *) save;
  status_var = system_var;
}

We update our system_var variable without any mutex protection, even though many threads may try to execute the SET statement at the same time. Nevertheless, it is safe: MySQL internally guards all accesses to global system variables with a mutex, which means we do not have to.

MYSQL_SYSVAR_LONGLONG(vars_system, system_var, 0,
  "A demo system var", sysvar_check, sysvar_update,
  0, 0, 123456789, 0);

This is the declaration of our system variable. It is a long long and is called vars_system. In fact, as this is a variable for the vars plugin, its full name will be vars_vars_system in SHOW VARIABLES. It is associated with the system_var variable in the code, has the check function sysvar_check() and the update function sysvar_update() defined above, and can only take values between 0 and 123456789.

struct st_mysql_sys_var *vars_system_var[] =
{
  MYSQL_SYSVAR(vars_system),
  NULL
};

This is the structure that stores all system variables to be passed to the declaration of this plugin. As we only have one variable, we include only that.

struct st_mysql_daemon vars_plugin_info =
{ MYSQL_DAEMON_INTERFACE_VERSION };

mysql_declare_plugin(vars)
{
  MYSQL_DAEMON_PLUGIN,
  &vars_plugin_info,
  "vars",
  "Andrew Hutchings",
  "A system and status variables example",
  PLUGIN_LICENSE_GPL,
  NULL,
  NULL,
  0x0100,
  vars_status_var,
  vars_system_var,
  NULL
}
mysql_declare_plugin_end;

This is very similar to the declaration of our first plugin, but this one has structures for the status variables and system variable listed.
When putting our new plugin into action we should see the following:

mysql> INSTALL PLUGIN vars SONAME 'vars.so';
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW STATUS LIKE 'vars_%';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| vars_status_var | 0     |
+-----------------+-------+
1 row in set (0.00 sec)

mysql> SHOW VARIABLES LIKE 'vars_%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| vars_vars_system | 0     |
+------------------+-------+
1 row in set (0.00 sec)

Our status and system variables are both set to 0 by default.

mysql> SET GLOBAL vars_vars_system=2384;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW STATUS LIKE 'vars_%';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| vars_status_var | 2384  |
+-----------------+-------+
1 row in set (0.00 sec)

mysql> SHOW VARIABLES LIKE 'vars_%';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| vars_vars_system | 2384  |
+------------------+-------+
1 row in set (0.00 sec)

Setting our system variable to 2384 has altered both the system variable and the status variable, so we have success!

mysql> SET GLOBAL vars_vars_system=2383;
ERROR 1210 (HY000): Incorrect arguments to SET

Our special check function works too. The variable cannot be updated to a value that is too close to its old value!

A Look Inside a MySQL Daemon Plugin

Packt
31 Aug 2010
11 min read
A look inside a Daemon plugin

Unlike UDFs, MySQL plugins store all of their metadata in the plugin's shared library, so when installing a plugin you only need to specify the name of the plugin and its shared library filename. This eliminates much of the user error during installation. With UDFs it is very easy to choose the wrong return type or forget the AGGREGATE keyword, but with plugins this is not possible.

Why write a Daemon plugin

Just like UDFs and other MySQL plugin types, the Daemon plugin can be used to add extra functionality to MySQL, with the same advantages and disadvantages. Daemon plugins are ideal for writing code that needs to reside in the server but does not need to communicate with it, such as heartbeat or monitoring plugins, because the simple Daemon plugin API does not provide any means for the server and a plugin to communicate with each other.

Installing and using Daemon plugins

Installing plugins is relatively easy because all of the information about a plugin is stored inside it. To install a plugin we can use the INSTALL PLUGIN statement as follows:

mysql> INSTALL PLUGIN my_plugin SONAME 'my_plugin.so';

Likewise, to remove a plugin we use:

mysql> UNINSTALL PLUGIN my_plugin;

When a plugin is installed it is initialized instantly, which means the code we write will start automatically when our plugin is installed. Upon installation a plugin is added to the mysql.plugin table, so MySQL knows it is installed and can load it again on startup. In other words, similar to UDFs, all installed plugins are loaded automatically when the server starts. A plugin is de-initialized either when it is uninstalled or when the MySQL server is shut down. It is worth noting that if the MySQL server crashes for any reason, the de-initialization of the plugin will not happen.

If a plugin is installed, we can prevent it from being loaded and executed at startup with the --disable-plugin-my-plugin or --plugin-my-plugin=OFF command-line options.
If we do not do that, MySQL will try to load it, because the default behavior is --plugin-my-plugin=ON. If the plugin fails to load, MySQL will note that fact in the error log and will continue without this plugin. If we want to be sure that a plugin is loaded in the server, and that the server will not simply ignore a plugin failure, we can use --plugin-my-plugin=FORCE. In this mode the server will exit if our plugin fails to load.

As we can see below, the mysql.plugin table simply contains the plugin name and the filename of the shared library containing the plugin:

mysql> SELECT * FROM mysql.plugin;
+-----------+--------------+
| name      | dl           |
+-----------+--------------+
| my_plugin | my_plugin.so |
+-----------+--------------+
1 row in set (0.01 sec)

MySQL has a SHOW command to give us information about installed plugins. This is very useful to see if a plugin is actually running. If there was a problem during initialization, the status for the plugin will be marked as DISABLED. A sample output for SHOW PLUGINS can be seen below:

mysql> SHOW PLUGINS\G
...
*************************** 11. row ***************************
   Name: my_plugin
 Status: ACTIVE
   Type: DAEMON
Library: my_plugin.so
License: GPL
11 rows in set (0.00 sec)

The Information Schema also includes a table for use with plugins, and it contains more detail than SHOW PLUGINS. It shows version information supplied by the plugin as well as the plugin description:

mysql> SELECT * FROM information_schema.plugins WHERE PLUGIN_NAME='my_plugin'\G
*************************** 1. row ***************************
           PLUGIN_NAME: my_plugin
        PLUGIN_VERSION: 1.0
         PLUGIN_STATUS: ACTIVE
           PLUGIN_TYPE: DAEMON
   PLUGIN_TYPE_VERSION: 50147.0
        PLUGIN_LIBRARY: my_plugin.so
PLUGIN_LIBRARY_VERSION: 1.0
         PLUGIN_AUTHOR: Andrew Hutchings
    PLUGIN_DESCRIPTION: Daemon example, shows a declaration
        PLUGIN_LICENSE: GPL
1 row in set (0.00 sec)

Technically, loading of plugins is very similar to loading of UDFs.
Problems that can arise, ways of solving them, and error messages are similar to those of UDFs.

The role of a version

As we have seen, there are three two-component version numbers in the INFORMATION_SCHEMA.PLUGINS table. One of them, PLUGIN_VERSION, is purely informational. It is a number that the plugin author can specify arbitrarily, and MySQL itself does not do anything with it. The other two are very important, though. They are used to protect the API, to make sure that a loaded plugin uses the same API version that the server provides.

This is one of the main differences from UDFs. The UDF API is not versioned; consequently, it has never evolved and still has only the features it had in MySQL 3.21.24. Extending the UDF API is risky: any change, and old UDFs may start crashing the server. The plugin API, on the other hand, is safe. It is protected by a version, and this version is part of every plugin library, recording the API version that the plugin was compiled against. When a plugin is loaded, the server verifies that the version is supported and refuses to load the plugin otherwise. That is, the server can guarantee that all loaded plugins are fully compatible with the server, and no crash can happen because of an API mismatch.

The API is protected by two version numbers because it contains two parts. One part is common to all plugin types; its version is 1.0, as can be seen in the PLUGIN_LIBRARY_VERSION column above. The other part is specific to each plugin type. For Daemon plugins this version is 50147.0, as seen in the PLUGIN_TYPE_VERSION column, and it is derived from the MySQL server version (which was 5.1.47 in my examples).

Defining Daemon plugins

The most basic of Daemon plugins needs no code at all; only a declaration is required.
A plugin declaration is an instance of the st_mysql_plugin structure:

struct st_mysql_plugin
{
  int type;
  void *info;
  const char *name;
  const char *author;
  const char *descr;
  int license;
  int (*init)(void *);
  int (*deinit)(void *);
  unsigned int version;
  struct st_mysql_show_var *status_vars;
  struct st_mysql_sys_var **system_vars;
  void *__reserved1;
};

The type defines what type of plugin this will be, which in turn defines what it can do. In MySQL 5.1, type can be set to one of several enumerated values; here we are talking about Daemon plugins, so it should be set to MYSQL_DAEMON_PLUGIN.

The info member is a pointer to the descriptor of the plugin. It contains the information specific to this particular plugin type (while the st_mysql_plugin structure itself contains the information applicable to any plugin, independently of its type). It always starts with an API version number, and for Daemon plugins this is all it contains. For other plugin types it may also contain plugin methods that the server will call, but Daemon plugins are not designed to communicate with the server, and their descriptor structure contains nothing besides the version:

struct st_mysql_daemon my_daemon_plugin =
{ MYSQL_DAEMON_INTERFACE_VERSION };

Next we have the name member, which specifies the name of the plugin as it will be used by MySQL. This is the name that needs to be used in the INSTALL PLUGIN statement, the name that will be seen in SHOW PLUGINS and SHOW ENGINES, and the name that all plugin configuration variables and command-line options will start from. That is, the name of the plugin should be a valid SQL identifier and should be good for the command line too. A safe choice is a name that consists of only Latin letters, digits, and underscores. Plugin names are not case-sensitive.

The author member is a string containing details about the author; it can contain anything we wish.
It must be in UTF-8 and can be arbitrarily long, but MySQL will only show the first 64 characters of it.

The final string member is descr, which should contain a description of the plugin. Again, we are free to put whatever we like here, but we would normally put a short line stating what the plugin does. Again, it is supposed to be UTF-8, and it can be as long as you want.

In the next member, each plugin specifies its license. This does not strictly do anything as such, but it should help prevent accidental distribution of a plugin under the wrong license. There are currently three possible values for the license member: PLUGIN_LICENSE_PROPRIETARY, PLUGIN_LICENSE_GPL, and PLUGIN_LICENSE_BSD.

Then we come to the init and deinit members, which are pointers to the plugin initialization and de-initialization functions. The initialization function is called when the plugin is loaded during INSTALL PLUGIN or server startup. The de-initialization function is called when a plugin is unloaded which, again, can happen for two reasons: UNINSTALL PLUGIN or server shutdown. In a Daemon plugin the initialization function is often used to fork a thread to run the main function of the plugin. Both the initialization and the de-initialization functions should return 0 on success and 1 on failure.

The version member should be used for the current version of our plugin. A two-component version is encoded as a hexadecimal number, where the lower 8 bits store the second component (the minor version) and all other bits store the first component (the major version). For example, if version is set to 0x205, MySQL will show it as "2.5", and if it is set to 0x123FF, MySQL will show it as "291.255". Unfortunately, there is no way to store a more complex version such as "1.10.14b-RC2" in this member.

MySQL has many status variables that can be seen with the SHOW STATUS statement, and there are different third-party tools that analyze this data, track how the status variables change over time, draw graphs, and so on.
A plugin can benefit from that by making its status and various statistics and performance values visible as MySQL status variables. A pointer to the list of the plugin's status variables is stored in the status_vars member.

Similarly, there is a SHOW VARIABLES statement. It lists all MySQL system variables, that is, variables that are used to alter the behavior of the server. They can have a server-wide or session-only effect, some of them can be set on the command line or in the configuration file, and they can be modifiable at runtime or read-only. This is all available to plugins too. A plugin can add new system variables to the server: global or session, with or without command-line option support, modifiable or read-only. As we would expect, a pointer to the array of these variables goes into the system_vars member.

Finally there is the __reserved1 member, which is unused in MySQL 5.1 and should be set to NULL.

MySQL provides two macros that help to declare plugins. A plugin declaration starts with the mysql_declare_plugin() macro. It takes one argument, a unique identifier for the plugin library, which is used automatically as needed to avoid name clashes when plugins are linked statically into the server. This identifier must be the same one that was used as the plugin name in the plug.in file. We can put many plugins in one library, but they all need to be declared in one place, after the mysql_declare_plugin() macro, separated by commas. We end the list of plugin declarations with the mysql_declare_plugin_end macro.
A complete example of the plugin declarations can be seen as follows:

mysql_declare_plugin(my_plugin)
{
  MYSQL_DAEMON_PLUGIN,
  &my_plugin_info,
  "my_plugin",
  "Andrew Hutchings ([email protected])",
  "Daemon example, shows a declaration",
  PLUGIN_LICENSE_GPL,
  my_plugin_init,
  my_plugin_deinit,
  0x0100,
  NULL,
  NULL,
  NULL
},
{
  MYSQL_DAEMON_PLUGIN,
  &my_plugin2_info,
  "my_plugin2",
  "Sergei Golubchik ([email protected])",
  "Another Daemon example, shows a declaration",
  PLUGIN_LICENSE_GPL,
  my_plugin2_init,
  NULL,
  0xC0FFEE,
  status,
  vars,
  NULL
}
mysql_declare_plugin_end;

This declares two plugins. We can see that the first one:

- is a Daemon plugin
- has an info structure called my_plugin_info
- is called my_plugin and was written by me (Andrew Hutchings)
- is described as an example plugin
- is GPL licensed
- has initialization and de-initialization functions
- is of version 1.0
- has no system or status variables

The second plugin can be interpreted similarly. It is also a Daemon plugin, of version 49407.238, with an initialization function but no de-initialization function, and with both status and system variables.

MySQL 5.1 Plugin: HTML Storage Engine—Reads and Writes

Packt
31 Aug 2010
8 min read
(For more resources on MySQL, see here.)

An idea of the HTML engine

Ever thought about what your tables might look like? Why not represent a table as a <TABLE>? You would be able to see it, visually, in any browser. Sounds cool. But how could we make it work?

We want a simple engine, not an all-purpose Swiss Army Knife HTML-to-SQL converter, which means we will not need any existing universal HTML or XML parsers, but can rely on a fixed file format. For example, something like this:

<html><head><title>t1</title></head><body><table border=1>
<tr><th>col1</th><th>other col</th><th>more cols</th></tr>
<tr><td>data</td><td>more data</td><td>more data</td></tr>
<!-- this row was deleted ... -->
<tr><td>data</td><td>more data</td><td>more data</td></tr>
... and so on ...
</table></body></html>

But even then this engine is way more complex than the previous example, and it makes sense to split the code. The engine could stay, as usual, in the ha_html.cc file, the declarations in ha_html.h, and if we need any utility functions to work with HTML we can put them in the htmlutils.cc file.

Flashback

A storage engine needs to declare a plugin and an initialization function that fills a handlerton structure. Again, the only handlerton method that we need here is the create() method.

#include "ha_html.h"

static handler* html_create_handler(handlerton *hton,
                                    TABLE_SHARE *table,
                                    MEM_ROOT *mem_root)
{
  return new (mem_root) ha_html(hton, table);
}

static int html_init(void *p)
{
  handlerton *html_hton = (handlerton *)p;
  html_hton->create = html_create_handler;
  return 0;
}

struct st_mysql_storage_engine html_storage_engine =
{ MYSQL_HANDLERTON_INTERFACE_VERSION };

mysql_declare_plugin(html)
{
  MYSQL_STORAGE_ENGINE_PLUGIN,
  &html_storage_engine,
  "HTML",
  "Sergei Golubchik",
  "An example HTML storage engine",
  PLUGIN_LICENSE_GPL,
  html_init,
  NULL,
  0x0001,
  NULL,
  NULL,
  NULL
}
mysql_declare_plugin_end;

Now we need to implement all of the required handler class methods.
Let's start with the simple ones:

const char *ha_html::table_type() const
{
  return "HTML";
}

const char **ha_html::bas_ext() const
{
  static const char *exts[] = { ".html", 0 };
  return exts;
}

ulong ha_html::index_flags(uint inx, uint part, bool all_parts) const
{
  return 0;
}

ulonglong ha_html::table_flags() const
{
  return HA_NO_TRANSACTIONS | HA_REC_NOT_IN_SEQ | HA_NO_BLOBS;
}

THR_LOCK_DATA **ha_html::store_lock(THD *thd, THR_LOCK_DATA **to,
                                    enum thr_lock_type lock_type)
{
  if (lock_type != TL_IGNORE && lock.type == TL_UNLOCK)
    lock.type = lock_type;
  *to++ = &lock;
  return to;
}

These methods are familiar to us. They say that the engine is called "HTML", it stores the table data in files with the .html extension, the tables are not transactional, the position for ha_html::rnd_pos() is obtained by calling ha_html::position(), and the engine does not support BLOBs.

Also, we need a function to create and initialize an HTML_SHARE structure:

static HTML_SHARE *find_or_create_share(const char *table_name,
                                        TABLE *table)
{
  HTML_SHARE *share;
  for (share = (HTML_SHARE*)table->s->ha_data; share;
       share = share->next)
    if (my_strcasecmp(table_alias_charset, table_name,
                      share->name) == 0)
      return share;

  share = (HTML_SHARE*)alloc_root(&table->s->mem_root,
                                  sizeof(*share));
  bzero(share, sizeof(*share));
  share->name = strdup_root(&table->s->mem_root, table_name);
  share->next = (HTML_SHARE*)table->s->ha_data;
  table->s->ha_data = share;
  return share;
}

It is exactly the same function, only the structure is now called HTML_SHARE, not STATIC_SHARE.

Creating, opening, and closing the table

Having done the basics, we can start working with the tables. The first operation, of course, is table creation. To be able to read, update, or even open the table, we need to create it first, right? Now, the table is just an HTML file, and to create a table we only need to create an HTML file with our header and footer, but with no data between them.
We do not need to create any TABLE or Field objects, or anything else; MySQL does that automatically. To avoid repeating the same HTML tags over and over, we define the header and the footer in the ha_html.h file as follows:

#define HEADER1 "<html><head><title>"
#define HEADER2 "</title></head><body><table border=1>\n"
#define FOOTER  "</table></body></html>"
#define FOOTER_LEN ((int)(sizeof(FOOTER)-1))

As we want the header to include the table name, we have split it in two parts. Now we can create our table:

int ha_html::create(const char *name, TABLE *table_arg,
                    HA_CREATE_INFO *create_info)
{
  char buf[FN_REFLEN+10];
  strcpy(buf, name);
  strcat(buf, *bas_ext());

We start by generating a filename. The "table name" that the storage engine gets is not the original table name; it has been converted to a safe filename. All "troublesome" characters are encoded, and the database name is included, separated from the table name by a slash. This means we can safely use name as the filename, and all we need to do is append an extension. Having the filename, we open the file and write our data:

  FILE *f = fopen(buf, "w");
  if (f == 0)
    return errno;
  fprintf(f, HEADER1);
  write_html(f, table_arg->s->table_name.str);
  fprintf(f, HEADER2 "<tr>");

First, we write the header and the table name. Note that we did not write the value of the name argument into the header, but took the table name from the TABLE_SHARE structure (as table_arg->s->table_name.str), because name is mangled to be a safe filename, and we would like to see the original table name in the HTML page title. Also, we did not write it into the file directly; we used a write_html() function, a utility of ours that performs the entity encoding necessary to get well-formed HTML. Let's not think about it too much now; we only need to remember to call it, and it can be written later. Now, we iterate over all fields and write their names wrapped in <th>...</th> tags.
Again, we rely on our write_html() function here:

  for (uint i = 0; i < table_arg->s->fields; i++)
  {
    fprintf(f, "<th>");
    write_html(f, table_arg->field[i]->field_name);
    fprintf(f, "</th>");
  }
  fprintf(f, "</tr>");
  fprintf(f, FOOTER);
  fclose(f);
  return 0;
}

Done: an empty table is created. Opening it is easy too. We generate the filename and open the file just as in the create() method. The only difference is that we need to remember the FILE pointer to be able to read the data later; we store it in fhtml, a member of the ha_html object:

int ha_html::open(const char *name, int mode, uint test_if_locked)
{
  char buf[FN_REFLEN+10];
  strcpy(buf, name);
  strcat(buf, *bas_ext());
  fhtml = fopen(buf, "r+");
  if (fhtml == 0)
    return errno;

When parsing an HTML file we will often need to skip over known patterns in the text. Instead of using a special library or a custom pattern parser, let's try scanf(): it exists everywhere, has a built-in pattern-matching language, and is powerful enough for our purposes. For convenience, we wrap it in a skip_html() function that takes a scanf() format and returns the number of bytes skipped. Assuming we have such a function, we can finish opening the table:

  skip_html(fhtml, HEADER1 "%*[^<]" HEADER2 "<tr>");
  for (uint i = 0; i < table->s->fields; i++)
  {
    skip_html(fhtml, "<th>%*[^<]</th>");
  }
  skip_html(fhtml, "</tr>");
  data_start = ftell(fhtml);

We skip the first part of the header, then "everything up to the opening angle bracket", which eats up the table name, and then the second part of the header. Then we skip the individual column headers in a loop, followed by the end-of-row </tr> tag. To avoid repeating this parsing later, we remember the offset where the row data starts.
At the end we allocate an HTML_SHARE and initialize the lock objects:

  share = find_or_create_share(name, table);
  if (share->use_count++ == 0)
    thr_lock_init(&share->lock);
  thr_lock_data_init(&share->lock, &lock, NULL);
  return 0;
}

Closing the table is simple, and should not come as a surprise to us:

int ha_html::close(void)
{
  fclose(fhtml);
  if (--share->use_count == 0)
    thr_lock_delete(&share->lock);
  return 0;
}
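The text leaves write_html() and skip_html() undefined. A minimal standalone sketch of what these two helpers could look like is shown below. This is an illustration only, not the book's actual htmlutils.cc code: the exact entity set handled by write_html() and the way skip_html() forwards its format (a real version would typically use vfscanf() to accept extra arguments) are assumptions here.

```c
#include <stdio.h>
#include <string.h>

/* Write a string to f, encoding the characters that are special in HTML,
   so that table and column names cannot break the markup. */
static void write_html(FILE *f, const char *s)
{
  for (; *s; s++)
  {
    switch (*s)
    {
    case '<': fputs("&lt;", f);   break;
    case '>': fputs("&gt;", f);   break;
    case '&': fputs("&amp;", f);  break;
    case '"': fputs("&quot;", f); break;
    default:  fputc(*s, f);
    }
  }
}

/* Skip over a known pattern using the scanf() matching language and
   return the number of bytes consumed (0 if nothing matched).
   Only suppressed (%*...) conversions are used by the callers above,
   so no result arguments are needed. */
static int skip_html(FILE *f, const char *fmt)
{
  long start = ftell(f);
  fscanf(f, fmt);
  return (int)(ftell(f) - start);
}
```

With these two helpers, the create() and open() methods shown earlier are complete enough to produce and re-parse the fixed file format the engine relies on.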

Packt
11 Aug 2010
3 min read

There is More to the ORDER BY Clause than Sorting a Column

(For more resources on similar content, see here.)

If all you need in a SELECT query is to order all the data in a table by just one of its columns, simply put that column in the query's ORDER BY clause.

SELECT TITLE, ISBN, PUBLISH_DATE
FROM BOOK
ORDER BY ISBN DESC

What if you need to order by more than one column, that is, hierarchical ordering? Again, just list all the ordering columns in the ORDER BY clause. Their sequence determines the rank of ordering: the first column is primary; the second column, secondary; and so forth down the ranks.

SELECT TITLE, ISBN, PUBLISH_DATE
FROM BOOK
ORDER BY TITLE, ISBN, PUBLISH_DATE DESC

But what if PUBLISH_DATE's data format is MM-YYYY-DD? Which one of 12-2010-20, 12-2009-30, and 10-2010-01 is higher or lower than the others? This format is not good for ordering as it stands, because you need to order the dates by their year, then by month, and lastly by day. This article shows how to solve various ordering cases in SELECT queries.

Derived

Let's first have a look at the SELECT query that solves the PUBLISH_DATE ordering. (All SELECT queries in this article are tested to run correctly in an Oracle database.) If you run:

SELECT PUBLISH_DATE
FROM BOOK
ORDER BY PUBLISH_DATE DESC

on the following publish dates:

[figure: sample publish dates]

its query output is:

[figure: rows ordered as plain text]

while we expect its output to be:

[figure: rows ordered by date]

We need to order by year (YYYY), then by month (MM), and lastly by day (DD). In the MM-YYYY-DD format the year starts at position 4, the month at position 1, and the day at position 9:

SELECT *
FROM BOOK
ORDER BY SUBSTR(PUBLISH_DATE, 4, 4) DESC,
         SUBSTR(PUBLISH_DATE, 1, 2) DESC,
         SUBSTR(PUBLISH_DATE, 9, 2) DESC

This kind of derived 'column' for ordering can be applied to data other than dates.

Translated

What if the data format of PUBLISH_DATE is MMM-YYYY-DD? The month is now a three-character name, not a number. Is JAN-2009-01 higher or lower than APR-2010-01?
We need to translate the month name into a number using a translation reference table:

[figure: MONTH_TRANSLATION_REF table, mapping MONTH_NAME to MONTH_NO]

Our SELECT query now becomes:

SELECT PUBLISH_DATE, MONTH_NAME, MONTH_NO
FROM BOOK, MONTH_TRANSLATION_REF
WHERE SUBSTR(PUBLISH_DATE, 1, 3) = MONTH_NAME
ORDER BY SUBSTR(PUBLISH_DATE, 5, 4) DESC,
         MONTH_NO DESC,
         SUBSTR(PUBLISH_DATE, 10, 2) DESC

In the MMM-YYYY-DD format the year starts at position 5 and the day at position 10, as the month name now takes three characters. If you run the query on the following publish dates:

[figure: sample publish dates]

the query output is:

[figure: rows ordered by date]

You might have noticed that the query also applies the 'derived' ordering from the previous example. The month translation reference table used in this example is based on the common convention that January is first (1), February is second (2), and so on. You are not limited to this kind of table; you can set up any custom translation reference table to suit your ordering needs.
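The derived and translated orderings above can be mirrored outside SQL as an ordinary sort comparator. The following standalone C sketch (an illustration only; Oracle does all of this server-side) sorts MMM-YYYY-DD strings by year, then by translated month number, then by day:

```c
#include <stdlib.h>
#include <string.h>

/* Dates in the MMM-YYYY-DD format discussed above, e.g. "JAN-2009-01".
   Zero-based offsets: month name at 0..2, year at 4..7, day at 9..10. */
static const char *month_names[12] = { "JAN","FEB","MAR","APR","MAY","JUN",
                                       "JUL","AUG","SEP","OCT","NOV","DEC" };

/* The MONTH_TRANSLATION_REF lookup as a function: name -> 1..12, 0 if unknown. */
static int month_no(const char *date)
{
  for (int i = 0; i < 12; i++)
    if (strncmp(date, month_names[i], 3) == 0)
      return i + 1;
  return 0;
}

/* Descending comparator: year first, then month number, then day;
   exactly the derived-key ordering the ORDER BY clause builds. */
static int cmp_date_desc(const void *a, const void *b)
{
  const char *x = *(const char * const *)a;
  const char *y = *(const char * const *)b;
  int r = strncmp(y + 4, x + 4, 4);            /* year  */
  if (r == 0) r = month_no(y) - month_no(x);   /* month */
  if (r == 0) r = strncmp(y + 9, x + 9, 2);    /* day   */
  return r;
}

/* Sort an array of such date strings in descending order. */
static void sort_dates_desc(const char **dates, size_t n)
{
  qsort(dates, n, sizeof(dates[0]), cmp_date_desc);
}
```

Sorting { "JAN-2009-01", "APR-2010-01", "DEC-2009-30" } with this comparator yields APR-2010-01, DEC-2009-30, JAN-2009-01, matching what the translated ORDER BY clause produces.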

Packt
05 Aug 2010
1 min read

Author Podcast - Ronald Rood discusses the birth of Oracle Scheduler

You can listen to the podcast here, or hit play in the media player below.
Packt
20 Jul 2010
2 min read

Packt to make ePub downloads available from its website

ePub (short for electronic publication) is a free and open eBook standard by the International Digital Publishing Forum (IDPF). ePub allows greater flexibility with content: with an ePub-formatted book, the display of text can be optimized for the reader's device. Other ePub features include improved design and layout, inline raster and vector images, and better search functionality, while also enabling DRM protection and embedded metadata. The company's marketing manager Damian Carvill said "Packt listens to all feedback from customers and attempts to act on it accordingly. After receiving a number of requests, we made it a priority to convert all of our existing eBooks into the ePub format. All of our future books, and we hope to publish over 20 in July, will also be available in this popular eBook format." Packt is inviting all customers who have purchased an eBook to download the ePub version of the book and enjoy its features and great flexibility. Alternatively, customers can sample the ePub format by accessing one of Packt's free eBooks. All purchases of new eBooks from today will be made available to download as an ePub file, as well as the standard PDF. Packt ePub formatted eBooks are available from Tuesday 20th July 2010. To access your first Packt ePub eBook, either previously purchased or one of Packt's free eBooks, please log into www.PacktPub.com and go to My account.

Packt
19 Jul 2010
2 min read

Use of Templates in Report using BIRT

Templates

Templates are predesigned report layouts.

Building a report from a template

Templates, for the most part, take the legwork out of having to recreate the same layout over and over again. With templates, we can either use the existing, canned Listing Report template or build our own, saving time for more complex tasks. It's as simple as creating a new report, using a template, and following the Cheat Sheets. In this example, we are going to use a Grouped Listing report template to create a simple Employee Sales Detail report, which we will use as a target for a drill-down report.

Create a new report called Employee Sales Details Report.rptDesign. Do not go through the wizard; stop at the template selection dialog. On the New Report – Template screen, select Grouped Listing report. When the report design opens, the Cheat Sheet is open on the right-hand side. It lays out a series of steps to create a basic report. As we already have the data source created in our library, go ahead and add the data source from the library, and click on the skip option for the data source hint. For the dataset, click on the Click to perform option. Name the dataset Get Employee Sales, and use the following query:

select EMPLOYEES.EMPLOYEENUMBER,
       EMPLOYEES.LASTNAME || ', ' || EMPLOYEES.FIRSTNAME name,
       ORDERDETAILS.PRICEEACH sales,
       ORDERS.ORDERDATE
from EMPLOYEES, CUSTOMERS, ORDERS, ORDERDETAILS
where ORDERS.ORDERNUMBER = ORDERDETAILS.ORDERNUMBER
  and EMPLOYEES.EMPLOYEENUMBER = CUSTOMERS.SALESREPEMPLOYEENUMBER
  and ORDERS.CUSTOMERNUMBER = CUSTOMERS.CUSTOMERNUMBER
  and ORDERS.ORDERDATE between ? and ?

Create two report parameters called startDate and endDate, and bind them to the dataset parameters. For startDate, use the default value of 2005-01-01, and for endDate, use the default value of 2005-04-01. Back at the Cheat Sheet, click on Click to Complete for the dataset. For the Edit Date Binding cheat sheet section, drag the fields over as shown in the next screenshot.

Make the Group Header row appear with a silver background and the text in bold. When we drag EmployeeNumber over, it will automatically create the group on the EmployeeNumber expression. Just be sure to drag it to the GroupHeader section as illustrated in the screenshot. Select the column with the Name and OrderDate fields, and select Suppress duplicates.

Packt
17 Jul 2010
3 min read

Use of Stylesheets for Report Designing using BIRT

Stylesheets

BIRT, being a web-based reporting environment, takes a page from general web development toolkits by importing stylesheets. However, BIRT stylesheets function slightly differently from regular stylesheets in a web development environment. We are going to add on to the Customer Orders report we have been working with, and will create some styles to be used in this report.

Open Customer Order.rptDesign. Right-click on the getCustomerInformation dataset and choose Insert into Layout. Modify the table visually to look like the next figure. Create a new dataset called getCustomerOrders using the following query:

//insert code 1

Link the dataset parameter to rprmCustomerID. Save the dataset, right-click on it, and select Insert to layout. Select the first ORDERNUMBER column. Under the Property Editor, select Advanced. In the Property Editor, go to the Suppress duplicates option and change it to true. This will prevent the OrderNumber data item from repeating the value it displays down the page.

In the Outline, right-click on Styles and choose New Style…. In the Pre-Defined Style drop down, choose table-header. A predefined style is an element that is already defined in the BIRT report. Selecting a predefined style will affect every element of that type within a report; in this case, the table header of every table in the report will have this style applied. Under the Font section, apply the following settings:

  Font: Sans-Serif
  Font Color: White
  Size: Large
  Weight: Bold

Under the Background section, set the Background Color to Black. Click OK. Now, when we run the report, we can see that the header line is formatted with a black background and a white font.

Custom stylesheets

In the example we just saw, we didn't have to apply the style to any element; it was automatically applied to the header of the order details table because it used a predefined style. This would be the case for any table that had the header row populated, and the same goes for any of the predefined styles in BIRT. So next, let's look at a custom defined style and apply it to our customer information table.

Right-click on the Styles section under the Outline tab and create a new style. In the Custom Style textbox, enter CustomerHeaderInfo. Under the Font section, enter the following information:

  Font: Sans Serif
  Color: White
  Size: Large
  Weight: Bold

Under the Background section, set the Background Color to Gray. Under the Box section, enter 1 points for all sections. Under the Border section, enter the following information:

  Style (All): Solid
  Color (All): White
  Width (All): Thin

Click OK and then click Save. Select the table which contains the customer information. Select the first column. Under the Property Editor, in the list box for the Styles, select CustomerHeaderInfo. The preview report will look like the following screenshot:

Right-click on the Styles section, and create a new custom style called CustomerHeaderData. Under Box, enter 1 points for all fields. Under Border, enter the following information:

  Style – Top: Solid
  Style – Bottom: Solid
  Color (All): Gray

Click OK. Select the Customer Information table. Select the second column. Right-click on the column selector and select Style | Apply Style | CustomerHeaderData. The finished report should look something like the next screenshot:

Packt
17 Jul 2010
3 min read

Creating Themes for a Report using BIRT

Creating themes Using the power of stylesheets and libraries, one has the ability to apply general formatting to an entire report design using themes and reuse these among different reports. Themes provide a simple mechanism to apply a wide range of styles to an entire report design without the need to manually apply them. The following example will move the styles that we have created in our library and will show how to apply them to our report using a theme. For each of the styles we have created, select them under the Outline tab and choose Export to Library…. Choose the ClassicCarsLibrary.rptLibrary file. All of the styles will reside under the defaultTheme themes section, so select this from the drop-down list that appears next to the Theme option. Repeat these steps for all styles we have created in Customer Orders.rptDesign. Delete all of the styles stored in the Customer Orders.rptDesign file. You will notice all the styles disappear from the report designer. In the Outline tab, under the Customer Orders.rptDesign file, select the root element titled Customer Orders.rptDesign. Right-click the element and select Refresh Library. The library should already be shared since we built the report using the library's data source and datasets. If it is not, open the Resource Explorer, choose ClassicCarsLibrary.rptLibrary, right-click and choose Use Library. Under the Property Editor, change the Themes drop down to ClassicCarsLibrarydefaultTheme. When we apply the theme, we will see the detail table header automatically apply the style for table-header. Apply the remaining custom styles to the two columns in the customer information section and the order detail row. Now, we know that we can create several different themes by grouping styles together in libraries. So, when developing reports, you can create several different looks that can be applied to reports, simply by applying themes to reports with the help of libraries. 
Using external CSS stylesheets

Another stylesheet feature is the ability to use external stylesheets and simply link to them. This works out very well when report documents are embedded into existing web portals: using the portal's stylesheets keeps a consistent look and feel and creates a sense of uniformity across the site. Imagine that our graphics designer gives us a CSS file and asks us to design our reports around it. There are two ways one can use CSS files in BIRT:

  Importing CSS files
  Using CSS as a resource

In the following examples we are going to illustrate both scenarios. I have a CSS file containing six styles: five for predefined report elements and one custom style. The following is the CSS stylesheet for the given report:

.page {
  background-color: #FFFFFF;
  font-family: Verdana, Arial, Helvetica, sans-serif;
  font-size: 12px;
  line-height: 24px;
  color: #336699;
}

.table-group-footer-1 {
  font-family: Verdana, Arial, Helvetica, sans-serif;
  font-size: 12px;
  line-height: 24px;
  color: #333333;
  background-color: #FFFFCC;
}

.title {
  font-family: Verdana, Arial, Helvetica, sans-serif;
  font-size: 24px;
  line-height: 40px;
  background-color: #99CC00;
  color: #003333;
  font-weight: bolder;
}

.table-header {
  font-family: Verdana, Arial, Helvetica, sans-serif;
  font-size: 20px;
  background-color: #669900;
  color: #FFFF33;
}

.table-footer {
  font-family: Arial, Helvetica, sans-serif;
  font-size: 14px;
  font-weight: bold;
  line-height: 22px;
  color: #333333;
  background-color: #CCFF99;
}
Packt
16 Jul 2010
3 min read

Creating a Reporting Site using BIRT-An Extension

Bug status

The next report we are going to undertake shows open issues compared to all bugs. What we are trying to build here is a graph that shows, by category, the bugs that are open in relation to the total number of bugs for that category. No other grouping is required. Under the graph, we also want to show a list of open bugs that we can click on to drill down to the detail for that bug, so that we can see its history of changes. What this means is that we need a bar graph showing all the statuses.

Open BugzillaReportsTemplate.rptTemplate and save it as bugStatusReport.rptDesign. Drag getAllBugs over to the Report Designer. Delete all columns except the bug_status and short_desc fields. In the table, group the data by the component's name. With the new category created in the table, insert a new column on the right. Delete the header labels. Move the data fields to look like the following screenshot:

In the Outline tab, select the root element and apply bugZillaTheme. In the group header row with the name, apply the DetailTableHeader style. In the Detail row, apply the DetailTableRow style. In the header row, select all the cells and merge them. Into the new merged cell, insert a chart. Select a Bar Chart and change the Output Format to PNG.

Open the Select Data tab. Set the Inherit Data from Container drop-down list to Inherit Columns Only. Drag the bug_status field to the Optional Y Series Grouping slot. Drag the name field to the Category (X) Series slot. Click on the Edit group and sorting button. In the Group and sorting dialog, check the Enabled checkbox, set the Type to Text, set the Interval to 0, and set the Aggregate Expression to Count. As the Value (Y) Series, enter 1. Under the Format Chart tab, go to Title and enter the title Bug Status Report. Select the Axis option. Under X-Axis, check the Stagger checkbox. Click Finish. Resize the chart to fit the number of categories.
The last thing we need to do is add the drill-through from the descriptions to the bug detail. Select the short_desc data item in the report designer. Under the Property Editor, select the Hyperlink tab. Click on the Edit... button next to Link To. In the Hyperlink dialog, select Drill-through as the Hyperlink type. Select BugzillaDetailReport.rptDesign as the target report. Set up the target report parameter bugID to be linked to row["bug_id"]. Click OK and save the report.

Packt
12 Jul 2010
10 min read

Understanding ShapeSheet™ in Microsoft Visio 2010

In this article by David J. Parker, author of Microsoft Visio 2010 Business Process Diagramming and Validation, we will discuss Microsoft Visio ShapeSheet™ and the key sections, rows, and cells, along with the functions available for writing ShapeSheet™ formulae, where relevant for structured diagrams. Microsoft Visio is a unique data diagramming system, and most of that uniqueness is due to the power of the ShapeSheet, which is a window on the Visio object model. It is the ShapeSheet that enables you to encapsulate complex behavior into apparently simple shapes by adding formulae to the cells using functions. The ShapeSheet was modeled on a spreadsheet, and formulae are entered in a similar manner to cells in an Excel worksheet. Validation rules are written as quasi-ShapeSheet formulae so you will need to understand how they are written. Validation rules can check the contents of ShapeSheet cells, in addition to verifying the structure of a diagram. Therefore, in this article you will learn about the structure of the ShapeSheet and how to write formulae. Where is the ShapeSheet? There is a ShapeSheet behind every single Document, Page, and Shape, and the easiest way to access the ShapeSheet window is to run Visio in Developer mode. This mode adds the Developer tab to the Fluent UI, which has a Show ShapeSheet button. The drop-down list on the button allows you to choose which ShapeSheet window to open. Alternatively, you can use the right-mouse menu of a shape or page, or on the relevant level within the Drawing Explorer window as shown in the following screenshot: The ShapeSheet window, opened by the Show ShapeSheet menu option, displays the requested sections, rows, and cells of the item selected when the window was opened. It does not automatically change to display the contents of any subsequently selected shape in the Visio drawing page—you must open the ShapeSheet window again to do that. 
The ShapeSheet Tools tab, which is displayed when the ShapeSheet window is active, has a Sections button on the View group to allow you to vary the requested sections on display. You can also open the View Sections dialog from the right-mouse menu within the ShapeSheet window. You cannot alter the display order of sections in the ShapeSheet window, but you can expand/collapse them by clicking the section header. The syntax for referencing the shape, page, and document objects in ShapeSheet formulae is listed in the following table.

Object                   ShapeSheet formula    Comment
Shape                    Sheet.n!              Where n is the ID of the shape; can be omitted when referring to cells in the same shape.
Page.PageSheet           ThePage!              Used in the ShapeSheet formulae of shapes within the page.
                         Pages[page name]!     Used in the ShapeSheet formulae of shapes in other pages.
Document.DocumentSheet   TheDoc!               Used in the ShapeSheet formulae in pages or shapes of the document.

What are sections, rows, and cells?

There are a finite number of sections in a ShapeSheet, and some sections are mandatory for the type of element they are, whilst others are optional. For example, the Shape Transform section, which specifies the shape's size, angle, and position, exists for all types of shapes. However, the 1-D Endpoints section, which specifies the co-ordinates of either end of the line, is only relevant, and thus displayed, for OneD shapes. Neither of these sections is optional, because they are required for the specific type of shape. Sections like User-defined Cells and Shape Data are optional and may be added to the ShapeSheet if they do not exist already. If you press the Insert button on the ShapeSheet Tools tab, under the Sections group, then you can see a list of the sections that you may insert into the selected ShapeSheet. In the above example, the User-defined Cells option is grayed out because this optional section already exists.
It is possible for a shape to have multiple Geometry, Ellipse, or Infinite line sections. In fact, a shape can have a total of 139 of them. Reading a cell's properties If you select a cell in the ShapeSheet, then you will see the formula in the formula edit bar immediately below the ribbon. Move the mouse over the image to enlarge it. You can view the ShapeSheet Formulas (and I thought the plural was formulae!) or Values by clicking the relevant button in the View group on the ShapeSheet Tools ribbon. Notice that Visio provides IntelliSense when editing formulae. This is new in Visio 2010, and is a great help to all ShapeSheet developers. Also notice that the contents of some of the cells are shown in blue text, whilst others are black. This is because the blue text denotes that the values are stored locally with this shape instance, whilst the black text refers to values that are stored in the Master shape. Usually, the more black text you see, the more memory efficient the shape is, since less is needed to be stored with the shape instance. Of course, there are times when you cannot avoid storing values locally, such as the PinX and PinY values in the above screenshot, since these define where the shape instance is in the page. The following VBA code returns 0 (False): ActivePage.Shapes("Task").Cells("PinX").IsInherited But the following code returns -1 (True) : ActivePage.Shapes("Task").Cells("Width").IsInherited The Edit Formula button opens a dialog to enable you to edit multiple lines, since the edit formula bar only displays a single line, and some formulae can be quite large. You can display the Formula Tracing window using the Show Window button in the Formula Tracing group on the ShapeSheet Tools present in Design tab. You can decide whether to Trace Dependents, which displays other cells that have a formula that refers to the selected cell or Trace Precedents, which displays other cells that the formula in this cell refers to. 
Of course, this can be done in code too. For example, the following VBA code will print out the properties of the selected ShapeSheet cell to the Immediate Window:

Public Sub DebugPrintCellProperties()
'Abort if a ShapeSheet is not selected in the Visio UI
  If Not Visio.ActiveWindow.Type = Visio.VisWinTypes.visSheet Then
    Exit Sub
  End If
  Dim cel As Visio.Cell
  Set cel = Visio.ActiveWindow.SelectedCell
'Print out some of the cell properties
  Debug.Print "Section", cel.Section
  Debug.Print "Row", cel.Row
  Debug.Print "Column", cel.Column
  Debug.Print "Name", cel.Name
  Debug.Print "FormulaU", cel.FormulaU
  Debug.Print "ResultIU", cel.ResultIU
  Debug.Print "ResultStr("""")", cel.ResultStr("")
  Debug.Print "Dependents", UBound(cel.Dependents)
'cel.Precedents may cause an error
  On Error Resume Next
  Debug.Print "Precedents", UBound(cel.Precedents)
End Sub

In the previous screenshot, where the Actions.SetDefaultSize.Action cell is selected in the Task shape from the BPMN Basic Shapes stencil, the DebugPrintCellProperties macro outputs the following:

  Section        240
  Row            2
  Column         3
  Name           Actions.SetDefaultSize.Action
  FormulaU       SETF(GetRef(Width),User.DefaultWidth)+SETF(GetRef(Height),User.DefaultHeight)
  ResultIU       0
  ResultStr("")  0.0000
  Dependents     0
  Precedents     4

Firstly, any cell can be referred to either by its name or by its section/row/column indices, commonly referred to as SRC. Secondly, FormulaU should produce a ResultIU of 0 if the formula is correctly formed and produces no numerical output. Thirdly, Precedents and Dependents are actually arrays of referenced cells.

Can I print out the ShapeSheet settings?

You can download and install the Microsoft Visio SDK from the Visio Developer Center (visit http://msdn.microsoft.com/en-us/office/aa905478.aspx). This will install an extra group, Visio SDK, on the Developer ribbon, with one extra button, Print ShapeSheet.
I have chosen the Clipboard option and pasted the report into an Excel worksheet, as in the following screenshot: The output displays the cell name, value, and formula in each section, in an extremely verbose manner. This makes for many rows in the worksheet, and a varying number of columns in each section.

What is a function?

A function defines a discrete action, and most functions take a number of arguments as input. Some functions produce an output as a value in the cell that contains the formula, whilst others redirect the output to another cell, and some do not produce a useful output at all. The Developer ShapeSheet Reference in the Visio SDK contains a description of each of the 197 functions available in Visio 2010, and there are some more that are reserved for use by Visio itself. Formulae can be entered into any cell, but some cells will be updated by the Visio engine or by specific add-ons, thus overwriting any formula that may be within the cell. Formulae are entered starting with the = (equals) sign, just as in Excel cells, so that Visio can understand that a formula is being entered rather than just text. Some cells have been primed to expect text (strings) and will automatically prefix what you type with =" (equals double-quote) and close with " (double-quote) if you do not start typing with an equals sign. For example, the function NOW() returns the current date-time value, which you can modify by applying a format, say, =FORMAT(NOW(),"dd/MM/YYYY"). In fact, the NOW() function will evaluate every minute unless you specify that it only updates at a specific event. You could, for example, cause the formula to be evaluated only when the shape is moved, by adding the DEPENDSON() function:

=DEPENDSON(PinX,PinY)+NOW()

The normal user will not see the result of any values unless there is something changing in the UI. This could be a value in the Shape Data that could cause linked Data Graphics to change.
Or there could be something more subtle, such as the display of some geometry within the shape, like the Compensation symbol in the BPMN Task shape. In the above example, you can see that the Compensation right-mouse menu option is checked, and the IsForCompensation Shape Data value is TRUE. These values are linked, and the Task shape itself displays the two triangles at the bottom edge. The custom right-mouse menu options are defined in the Actions section of the shape's ShapeSheet, and one of the cells, Checked, holds a formula that determines whether a tick should be displayed. In this case, the Actions.Compensation.Checked cell contains the following formula, which is merely a cell reference:

=Prop.BpmnIsForCompensation

Prop is the prefix used for all cells in the Shape Data section, because this section used to be known as Custom Properties. The Prop.BpmnIsForCompensation row is defined as a Boolean (True/False) Type, so the returned value is going to be 1 or 0 (True or False). Thus, if you were to build a validation rule that required a Task to be for Compensation, then you would have to check this value. You will often need to branch expressions using the following:

IF(logical_expression, value_if_true, value_if_false)

Indexing in MySQL Admin
Packt
29 Jun 2010
6 min read
Using prefix primary keys

In this example we will add indexes to two tables that are almost identical. The only difference is the definition of their primary keys. You will see the difference in space consumption for secondary indexes between a regular full-column primary key and a prefix primary key. The sample table structure and data are designed to demonstrate the effect very clearly. In real-world scenarios the effect will almost certainly be less severe.

Getting ready...

Connect to the database server with your administrative account.

How to do it...

1. Download the sample script for this article from the book's website and save it to your local disk. In the example below, we will assume it is stored in /tmp/idxsizeestimate_sample.sql.

2. Create a new database and make it the default database:

   CREATE DATABASE pktests;
   USE pktests;

3. Import the sample data from the downloaded file. When done, you will be presented with some statistics about the two tables loaded. Note that both tables have an Index Length of 0.

   SOURCE /tmp/idxsizeestimate_sample.sql;

4. Now, with the sample tables present, add an index to each of them:

   ALTER TABLE LongCharKey ADD INDEX IDX_PAY_10(Payload(10));
   ALTER TABLE LongCharKey10 ADD INDEX IDX_PAY_10(Payload(10));

5. Display the data and index sizes of the tables now:

   SHOW TABLE STATUS LIKE 'LongCharKey%';

6. Add another index to each table to make the difference even more evident:

   ALTER TABLE LongCharKey ADD INDEX IDX2_PAY_10(Payload(10));
   ALTER TABLE LongCharKey10 ADD INDEX IDX2_PAY_10(Payload(10));

7. Display the data and index sizes of the tables again and compare with the previous values:

   SHOW TABLE STATUS LIKE 'LongCharKey%';

   Name           Rows   Data Length  Index Length  Index/Data Ratio
   LongCharKey    50045  30392320     28868608      94.99%
   LongCharKey10  50045  29949952     3178496       10.61%

   With the second index added, the difference in index length becomes even clearer.

How it works...
Executing the downloaded script will set up two tables with the following structures:

CREATE TABLE `LongCharKey` (
  `LongChar` char(255) NOT NULL,
  `Payload` char(255) DEFAULT NULL,
  PRIMARY KEY (`LongChar`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

CREATE TABLE `LongCharKey10` (
  `LongChar` char(255) NOT NULL,
  `Payload` char(255) DEFAULT NULL,
  PRIMARY KEY (`LongChar`(10))
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

The two tables are almost identical, except for the primary key definition, and are populated with exactly the same 50,000 records of pseudo-random data. The Payload column is filled with sequences of 255 random letters each. The LongChar column is filled with a sequential number in the first 10 characters and then filled up to use all remaining 245 characters with the same sort of random data.

SELECT LEFT(LongChar,20), LEFT(Payload, 20) FROM LongCharKey LIMIT 5;

LEFT(LongChar,20)     LEFT(Payload, 20)
0000000000KEAFAYVEJD  RHSKMEJITOVBPOVAGOGM
0000000001WSSGKGMIJR  VARLGOYEONSLEJVTVYRP
0000000002RMNCFBJSTL  OVWGTTSHEQHJHTHMFEXV
0000000003SAQVOQSINQ  AHDYUXTAEWRSHCLJYSMW
0000000004ALHYUDSRBH  DPLPXJVERYHUOYGGUFOS

While the LongCharKey table simply marks the whole LongChar column as a primary key with its entire 255-character length, the LongCharKey10 table limits the primary key to the first 10 characters of that column. This is perfectly fine for this table, because the test data was crafted to be unique in this range. Neither of the two tables has any secondary indexes defined. Looking at some relevant table data shows they are roughly equally big (some columns left out for brevity):

SHOW TABLE STATUS LIKE 'LongCharKey%';

Name           Rows   Data Length  Index Length
LongCharKey    50045  30392320     0
LongCharKey10  50045  29949952     0

With each index added, the Index Length for the first table will increase significantly, while for the second one it grows much more slowly.
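The size difference can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope model only: it assumes a single-byte charset and ignores InnoDB's per-record overhead, so the numbers will not match SHOW TABLE STATUS exactly, but the proportions hold.

```python
# Rough model of InnoDB secondary index record sizes for the two tables:
# every secondary index record stores the indexed value PLUS a copy of the
# primary key of the row it points to.

PREFIX_LEN = 10      # indexed prefix of the Payload column
FULL_PK_LEN = 255    # LongCharKey: the whole LongChar column is the PK
SHORT_PK_LEN = 10    # LongCharKey10: 10-character prefix PK
ROWS = 50045

def secondary_index_bytes(key_len, pk_len, rows):
    # indexed value + primary key copy, per record
    return (key_len + pk_len) * rows

full = secondary_index_bytes(PREFIX_LEN, FULL_PK_LEN, ROWS)
short = secondary_index_bytes(PREFIX_LEN, SHORT_PK_LEN, ROWS)

print(f"LongCharKey   : ~{full:,} bytes per secondary index")
print(f"LongCharKey10 : ~{short:,} bytes per secondary index")

# Four records of the full-key table carry 4 * 255 = 1020 bytes of primary
# key copies: roughly a kilobyte spent on PK references alone.
print("PK bytes per 4 records (full key):", 4 * FULL_PK_LEN)
```

Running it shows the full-key table paying more than an order of magnitude more per secondary index, which matches the ratio observed in the SHOW TABLE STATUS output above.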
In the case of the LongCharKey table, each secondary index record carries with it a complete copy of the LongChar column, because the full column is the primary key. Assuming a single-byte character encoding, this means every secondary index record is blown up in size by 255 bytes on top of the 10 bytes needed for the actual index entry. In other words, a whole kilobyte is spent just on primary key copies for every 4 records! In contrast, the primary key definition of the LongCharKey10 table only includes the leading 10 characters of the LongChar column, making each secondary index entry 245 bytes shorter and thereby explaining the much slower growth when adding further indexes.

Choosing InnoDB primary key columns

In the article introduction we promised to shed some light on how to choose your InnoDB primary key columns sensibly. Be advised that choosing good primary key columns is not an exact science; there are multiple aspects that influence this decision. Depending on your needs and preconditions, you will want to prioritize them differently from one table to the next. Consider the following as general advice rather than hard rules that must be obeyed unconditionally.

Getting ready

In order to make reasonable decisions on primary key columns, it is important to have a very clear understanding of what the data in the table at hand looks like. If you already have existing data that is to be stored in an InnoDB table (for example, in MyISAM format), it can be helpful to compare it with the criteria below. If you are planning a new schema, you might have to guess about some characteristics of the future data. As is often the case, the quality of your choices is directly proportional to how good those guesses are.
This recipe is less a set of strict step-by-step instructions to be followed from top to bottom, and more a list of properties a good primary key should have, even though you might decide that some of them do not apply to your actual environment. As a rule of thumb, however, a column that fulfills all or most of the attributes described below is most probably a sensible choice for a primary key. See the How it works... section for details on the individual items.

How to do it...

1. Identify unique attributes: This is an absolute (technical) requirement for primary keys in general. Any data attribute that is not strictly guaranteed to be free of duplicates cannot be used alone as a primary key.

2. Identify immutable attributes: While not absolutely necessary, a good primary key is never changed once it has been assigned. For all intents and purposes, you should avoid columns that have even a small chance of being changed for existing records.

3. Use reasonably short keys: This is the "softest" criterion of all. In general, longer keys have negative impacts on overall database performance, and the longer the key, the worse the impact. Also consider a prefix primary key; see Using prefix primary keys earlier in this article for more information.

4. Prefer single-column keys: Even though nothing prevents you from choosing a composite primary key (a combination of columns that together form the uniqueness), this can easily become a hassle to work with, especially when handling foreign keys.

5. Consider the clustered index nature of the primary key: As InnoDB's primary key is also clustered, you should take this special nature into account as well. It can speed up read access a lot if you often query for key ranges, because disk seek times will be minimized.
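The mechanical parts of this checklist (uniqueness, key length) can be screened automatically. The helper below is a hypothetical sketch, not part of any MySQL tooling; immutability and the clustered-index considerations still require knowledge of how the application uses the data.

```python
# Hypothetical screen of a candidate primary key column against the
# mechanically checkable criteria above: uniqueness (mandatory) and
# reasonable key length. max_len is an arbitrary illustrative threshold.

def score_candidate_key(rows, column, max_len=32):
    values = [r[column] for r in rows]
    unique = len(values) == len(set(values))           # criterion 1
    short = all(len(str(v)) <= max_len for v in values)  # criterion 3
    return {"unique": unique, "reasonably_short": short}

sample = [
    {"id": 1, "email": "john@example.com"},
    {"id": 2, "email": "bill@example.com"},
    {"id": 3, "email": "john@example.com"},  # duplicate email
]

print(score_candidate_key(sample, "id"))     # unique and short: a candidate
print(score_candidate_key(sample, "email"))  # not unique: disqualified
```

A column that fails the uniqueness check is disqualified outright; a column that passes both is merely a candidate, to be weighed against criteria 2, 4, and 5 by hand.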

Oracle: Environmental Variables and Scripting
Packt
22 Jun 2010
10 min read
(For more resources on Oracle, see here.)

Unix scripting will be one of the most powerful tools in your arsenal, and only with constant use will you become proficient. Your standards for scripting need to produce code that is robust, modular, consistent, and scalable. This article won't cover everything you need to know about scripting; there are many excellent lessons available from your operating system vendor. See the following documents for more information specific to how Oracle products behave in a Unix environment:

Note: 131207.1 Subject: How to Set Unix Environment Variables
Note: 1011994.6 Subject: How do Applications Act in Different Unix Shells
How to Integrate the Shell, SQL*Plus Scripts and PL/SQL in any Permutation? [ID 400195.1] (excellent document)

It is also suggested that the reader read Oracle Database Concepts 11g Release 2 (11.2) from the documentation set, the book every DBA should start with.

Usually cron is used for scheduling on Unix, and the AT command on Windows. For Oracle utilities that run outside the database (or which must only run when the database is down), Unix shell scripting is best. A well-written script would parse /etc/oratab for a list of ORACLE_SIDs on a single server for tasks such as parsing log files, consistent backups, exports, dbverify, and RMAN. If you have a solitary database, then DBMS_SCHEDULER can be used with a combination of SQL and PL/SQL integrated with OS commands.

Occasionally, DBAs rename the oraenv located in $ORACLE_HOME/bin when they have added custom code to /usr/local/bin/oraenv (the default location on Linux), so that they can be sure exactly which one is executed at run time. If you have any problems related to running a modified oraenv file, check which one is being executed and adjust the $PATH variable as needed.

The following is a list of some of the Oracle-provided Unix commands with a quick synopsis of their purpose.
Most of them are located in $ORACLE_HOME/bin:

wrap—encrypts stored procedures for advanced use
oerr—displays Oracle errors. Syntax: oerr ora 12154
sysresv—lists instance and shared memory segments
tkprof—formats an output trace file into readable format
dbshut—shell script to shut down all instances
dbstart—shell script to start up all instances at boot
dbhome—sets ORACLE_HOME
oraenv—sets environmental variables for ORACLE_SID
trcasst—trace assistant

Guidelines for scripting

These are general suggestions for someone with some experience of a Unix operating system; you will need more background information to understand this section fully. The best sources of information will be the vendor of your operating system, because there are small differences between the different versions and flavors of Unix. As with any code, you must test on a non-production system first, as inexperience may lead to unexpected results.

Separating the configuration file

Use the OS-provided default profile for environmental variables, but use a separate configuration file to set the $ORACLE_SID variable. There are several key environmental variables that are commonly used with Oracle software, all of which are found in the documentation specific to the operating system. Optimal Flexible Architecture (OFA) mentions setting the $ORACLE_SID in the profile, but if you have more than one database or Oracle product installed on a system, it is best done interactively. A default profile is loaded when you first log in to Unix. So if the $ORACLE_SID is loaded when you log on, what happens when you want to change ORACLE_SID(s)? This is when the environment becomes mixed: the $PATH variable is appended each time you execute the oraenv script. Which set of executables will you find first? Those for which you ran oraenv the first time.
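A script that must act on every database on the server typically parses /etc/oratab, as mentioned earlier. The following is a minimal sketch of that loop; the file locations and the sample entries are illustrative only, and a real maintenance script would replace the echo lines with its actual work:

```shell
#!/bin/sh
# Sketch: loop over every ORACLE_SID registered in oratab, skipping comment
# lines and entries whose third field is N (not to be acted on automatically).

ORATAB=${ORATAB:-/etc/oratab}

# For demonstration only: build a sample oratab if none exists at $ORATAB.
if [ ! -f "$ORATAB" ]; then
    ORATAB=/tmp/oratab.sample
    cat > "$ORATAB" <<'EOF'
# sid:oracle_home:autostart
newdb:/u01/app/oracle/product/11.1.0/db_1:Y
ea:/u01/app/oracle/product/agent10g:N
EOF
fi

while IFS=: read -r sid home flag; do
    case "$sid" in "#"*|"") continue ;; esac   # skip comments and blanks
    if [ "$flag" = "Y" ]; then
        echo "would run maintenance against $sid ($home)"
    else
        echo "skipping $sid (flag $flag)"
    fi
done < "$ORATAB"
```

Each iteration would normally source oraenv for the current $sid before doing any work, so that ORACLE_HOME and $PATH are set correctly for that database.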
At this point I wanted to execute SQL*Plus out of the 11g directory, but was able to determine that the 10gR2 client was first in the $PATH. How did I know that? Use the which Unix command to find out. It is also good practice to use a separate terminal window for each ORACLE_HOME. Normally, you will operate with multiple windows open, one for each ORACLE_HOME or ORACLE_SID, in different colors to visually remind you which one is production.

The example profile is provided in the code: example_profile.txt. The profile sets the entire user environment at first logon. This one is specific to the ksh or korn shell on the Linux operating system and will also work for bash. One difference in bash is that the line history is scrollable with the up and down arrows instead of vi commands.

To set the ORACLE_SID and activate all of the variables located in the profile, source the file oraenv (bash, Bourne, or korn shell) or coraenv (C shell). Sourcing means the variables will be in effect for the entire session and not just the current line in the command window. You indicate sourcing by putting a . (dot) in front of the file. As the oraenv file is located in /usr/local/bin (on Linux) and this location is in the $PATH, typing it at the command line works. Putting key Oracle files, such as oraenv, oratab, and oraInst.loc, in locations that will not be affected by standard Oracle installations (outside any ORACLE_HOME) is also part of the OFA. The oraenv script is installed into /usr/local/bin/ automatically when runInstaller is run for the first time. Notice the prompt that you will see if you use the command as in the profile listed above.

A note about prompts: every person will want to customize their own prompt, so look around for various versions that tickle your fancy. Notice the small difference in the prompt before and after?
ORACLE_SID is now defined because oraenv was executed, which also runs dbhome (also located in /usr/local/bin); both of these scripts require a valid entry in /etc/oratab. Note that the ORACLE_SID is case sensitive on Unix, so type it carefully. If there is no matching entry in oratab, oraenv will ask where the ORACLE_HOME is. Making an ORACLE_SID lowercase conforms to the OFA standard (see the install guide for your OS for more information). Some DBAs use an uppercase ORACLE_SID instead, because it is more prominent in any naming convention and is meant to reduce human error.

You can also use an ORACLE_SID entry in the oratab file to reference other Oracle products, such as the Grid Control Intelligent Agent. The ea, which is an ORACLE_SID in the following oratab example, is what I use to indicate the Enterprise Manager Agent. The third field after the ORACLE_HOME (N) indicates that Oracle-supplied utilities (like dbshut and dbstart) are not to execute against this ORACLE_HOME. I personally use the N in my own scripting to indicate which utilities shouldn't run against this ORACLE_HOME. This takes advantage of the Oracle-provided files (oratab and oraenv) to accomplish other types of tasks, and is only a suggested use; there are other ways of setting environmental variables for non-database products.

You will need to create a test database to work through all of the examples and practice scenarios. How should you create it? Use the Oracle-provided Database Configuration Assistant (DBCA) utility; the default templates provided will work for most of the tasks outlined in this article. If you are interested in duplicating some of the advanced tasks (like Data Guard), the Enterprise Edition of Oracle Database is required. All tasks in this article were done with the 11.1.0.7 version of Oracle Database, with some references to 11.2.0.1, which had just been released.
Host commands relative location

This is important as you begin scripting: host commands are relative to the location of the executable. As a general rule, you should execute database-specific utilities (imp, exp, Data Pump, RMAN, and so forth) on the server where the database is located, from the correct ORACLE_HOME. This reduces the number of issues such as core dumps and version incompatibilities, and is different from what is usually thought of as client utilities such as SQL*Plus. There are exceptions to this rule; for example, it is recommended to run compiled code (C, C++, Cobol) on a separate server rather than on the database server. See the following document on setting the TWO_TASK environmental variable when using a separate node for compiled programs: Subject: How to Customize Pro*C and Pro*Cobol Makefiles demo_proc.mk And demo_procob.mk On Linux/Unix [Doc ID: 602936.1].

Notice the WARNING! message that is set using the new 11g sqlnet.ora parameter SEC_USER_UNAUTHORIZED_ACCESS_BANNER. The sqlnet.ora file is part of the SQL*Net components of the Oracle RDBMS, which handle the communication between clients and the database.

oracle@nodename:/u01/app/oracle/admin/newdb[newdb]> sqlplus /nolog

SQL*Plus: Release 11.1.0.7.0 - Production on Thu Nov 5 19:00:29 2009
Copyright (c) 1982, 2008, Oracle. All rights reserved.

@> connect / as sysdba
####################################################################
WARNING! This computer system is the property of YOUR ORGANIZATION
and may be accessed only by authorized users.
Unauthorized use of this system is strictly prohibited and may be
subject to criminal prosecution.
Connected.
SYS@newdb>

If you want to execute something at the operating system level, use a host command (on either Windows or Unix), or on Unix the ! symbol.
The output below shows that I am logged into newdb as SYS and list (with the ls command) the files located in the $ORACLE_HOME/sqlplus/admin directory. Notice how the SQL prompt is populated with the ORACLE_SID and the username that is connected to the database. This is done by adding a line to the glogin.sql file, which can be done from within SQL*Plus (I used the text editor vi).

Host commands work based on the location of SQL*Plus. If you want to execute these same commands from a Windows desktop connecting to a remote Unix database, a Windows-equivalent command such as Notepad or another editor is required. If you have Cygwin installed and configured on the Windows desktop, it would allow you to run Unix-equivalent commands like vi.

Separating the variable part of the script into its own configuration file

Some scripts need variables set, but you don't necessarily want to use the profile to set them at every login. The variables may need to contain commands specific to applications such as RMAN or SQL*Plus, or specific to certain tasks. This is where a configuration file comes in handy; what you call it is a personal preference. Be sure not to use reserved keywords for variables, as that leads to unexpected results. In the example below, we use emal instead of the word email. To call this configuration file, you source it, starting with a . (dot), from within a script. This file can also be used in the custom code section of the oraenv file.
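The idea can be sketched as follows. The file name and the variables in it are illustrative assumptions, not taken from the book's actual example_profile.txt:

```shell
#!/bin/sh
# Sketch: task-specific variables live in their own configuration file,
# which any script can source (note the leading dot) to pick them up.

CONF=/tmp/rman_backup.conf

cat > "$CONF" <<'EOF'
# rman_backup.conf -- variables for a nightly RMAN backup script
BACKUP_DIR=/u02/backup
RETENTION_DAYS=7
EMAL=dba@example.com   # 'emal' on purpose: avoid reserved words like 'email'
EOF

# Sourcing with '.' makes the variables part of the current session,
# not just of a child process.
. "$CONF"

echo "Backups to $BACKUP_DIR, kept $RETENTION_DAYS days, alerts to $EMAL"
# prints: Backups to /u02/backup, kept 7 days, alerts to dba@example.com
```

Because the configuration is separate, the same backup script can be pointed at different databases or retention policies simply by sourcing a different file.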

Amazon SimpleDB versus RDBMS
Packt
08 Jun 2010
7 min read
(For more resources on SimpleDB, see here.)

We have all used a Relational Database Management System (RDBMS) at some point in our careers. These relational databases are ubiquitous and are available from a wide range of companies such as Oracle, Microsoft, IBM, and so on. They have served our application needs well. However, a new breed of applications is coming to the forefront in the current Internet-driven and socially networked economy. These new applications must scale to meet demand peaks that can quickly reach massive levels. That is a scenario that is hard to satisfy with a traditional relational database, as it is impossible to requisition and provision the hardware and software resources needed to service the demand peaks ahead of time. It is also difficult to scale a normal RDBMS to hundreds or thousands of nodes; the overwhelming complexity of doing so makes the RDBMS non-viable for these kinds of applications. SimpleDB provides a great alternative and can be a solution to all of these problems. However, in order to provide this solution, SimpleDB makes design decisions that you need to understand in order to make an informed choice about the data storage for your application domain.

No normalization

Normalization is a process of organizing data efficiently in a relational database by eliminating redundant data while ensuring that the data dependencies make sense. SimpleDB data models do not conform to any of the normalization forms and tend to be completely de-normalized. The lack of need for normalization in SimpleDB allows you a great deal of flexibility with your model, and enables you to use the power of multi-valued attributes in your data. Let's look at a simple example of a database, starting with a basic spreadsheet structure, and then design it for an RDBMS and for SimpleDB.
In this example, we will create a simple contact database with contact information as raw data:

ID   First_Name  Last_Name  Phone_Num
101  John        Smith      555-845-7854
101  John        Smith      555-854-9885
101  John        Smith      555-695-7485
102  Bill        Jones      555-748-7854
102  Bill        Jones      555-874-8654

The obvious issue is the repetition of the name data. The table is inefficient and would require care when updating to keep the name data in sync. Finding a person by his or her phone number is easy:

SELECT * FROM Contact_Info WHERE Phone_Num = '555-854-9885'

So let's analyze the strengths and weaknesses of this database design.

SCORE - Raw data                    Strength  Weakness
Efficient storage                             No
Efficient search by phone number    Yes
Efficient search by name                      No
Easy to add another phone number    Yes

The design is simple, but as the name data is repeated, it would require care to keep the data in sync. Searching for phone numbers by name would be ugly if the names got out of sync.

To improve the design, we can rationalize the data. One approach would be to create multiple phone number fields, as follows. While this is a simple solution, it limits the phone numbers to three. Add e-mail and Twitter, and the table becomes wider and wider.

ID   First_Name  Last_Name  Phone_Num_1   Phone_Num_2   Phone_Num_3
101  John        Smith      555-845-7854  555-854-9885  555-695-7485
102  Bill        Jones      555-748-7854  555-874-8654

Finding a person by a phone number is ugly:

SELECT * FROM Contact_Info
WHERE Phone_Num_1 = '555-854-9885'
OR Phone_Num_2 = '555-854-9885'
OR Phone_Num_3 = '555-854-9885'

Now let's analyze the strengths and weaknesses of this database design.

SCORE - Rationalized data           Strength  Weakness
Efficient storage                   Yes
Efficient search by phone number              No
Efficient search by name            Yes
Easy to add another phone number              No

The design is simple, but the phone numbers are limited to three, and searching by phone number involves three index searches.
Another approach would be to use a delimited list for the phone numbers:

ID   First_Name  Last_Name  Phone_Nums
101  John        Smith      555-845-7854;555-854-9885;555-695-7485
102  Bill        Jones      555-748-7854;555-874-8654

This approach has the advantage of no data repetition and is easy to maintain, compact, and extendable, but the only way to find a record by phone number is with a substring search:

SELECT * FROM Contact_Info WHERE Phone_Nums LIKE '%555-854-9885%'

This type of SQL forces a complete table scan. Do this with a small table and no one will notice, but try it on a large table with millions of records, and the performance of the database will suffer.

SCORE - Delimited data              Strength  Weakness
Efficient storage                   Yes
Efficient search by phone number              No
Efficient search by name            Yes
Easy to add another phone number    Yes

A delimited field is good for data that is of one type and will only ever be retrieved as a whole.

Normalization in relational databases results in splitting up your data into separate tables that are related to one another by keys. A join is an operation that allows you to retrieve the data back easily across the multiple tables. Let's first normalize this data.

This is the Person_Info table:

ID   First_Name  Last_Name
101  John        Smith
102  Bill        Jones

And this is the Phone_Info table:

ID   Phone_Num
101  555-845-7854
101  555-854-9885
101  555-695-7485
102  555-748-7854
102  555-874-8654

Now a join of the Person_Info table with Phone_Info can retrieve the list of phone numbers for a contact. The table structure is clean, and other than the ID primary key, no data is duplicated. Provided Phone_Num is indexed, retrieving a contact by phone number is efficient.
SELECT First_Name, Last_Name, Phone_Num, Person_Info.ID
FROM Person_Info JOIN Phone_Info
ON Person_Info.ID = Phone_Info.ID
WHERE Phone_Num = '555-854-9885'

So if we analyze the strengths and weaknesses of this database design, we get:

SCORE - Relational data             Strength  Weakness
Efficient storage                   Yes
Efficient search by phone number    Yes
Efficient search by name            Yes
Easy to add another phone number    Yes

While this is an efficient relational model, there is no join command in SimpleDB. Using two tables would force two selects to retrieve the complete contact information. Let's look at how this would be done using the SimpleDB principles.

No joins

SimpleDB does not support the concept of joins. Instead, it provides the ability to store multiple values for an attribute, avoiding the need to perform a join to retrieve all the values.

ID
101  First_Name=John  Last_Name=Smith  Phone_Num=555-845-7854
                                       Phone_Num=555-854-9885
                                       Phone_Num=555-695-7485
102  First_Name=Bill  Last_Name=Jones  Phone_Num=555-748-7854
                                       Phone_Num=555-874-8654

In the SimpleDB table, each record is stored as an item with attribute/value pairs. The difference here is that the Phone_Num attribute has multiple values. Unlike a delimited list field, SimpleDB indexes all values, enabling an efficient search on each one:

SELECT * FROM Contact_Info WHERE Phone_Num = '555-854-9885'

This SELECT is quick and efficient. It is even possible to use Phone_Num multiple times:

SELECT * FROM Contact_Info WHERE Phone_Num = '555-854-9885'
OR Phone_Num = '555-748-7854'

Let's analyze the strengths and weaknesses of this approach:

SCORE - SimpleDB data               Strength  Weakness
Efficient storage                   Yes
Efficient search by phone number    Yes
Efficient search by name            Yes
Easy to add another phone number    Yes
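The multi-valued attribute model above can be sketched in a few lines of Python. This is an in-memory illustration of the data model only, not the real SimpleDB API; the `select` helper mimics the "match if ANY value of the attribute equals an operand" semantics:

```python
# Minimal in-memory sketch of the SimpleDB multi-valued attribute model:
# each item maps attribute names to a SET of values.

contact_info = {
    "101": {"First_Name": {"John"}, "Last_Name": {"Smith"},
            "Phone_Num": {"555-845-7854", "555-854-9885", "555-695-7485"}},
    "102": {"First_Name": {"Bill"}, "Last_Name": {"Jones"},
            "Phone_Num": {"555-748-7854", "555-874-8654"}},
}

def select(items, attr, *operands):
    # WHERE attr = op1 OR attr = op2 ...
    # An item matches if any of its values for attr equals any operand.
    return [item_id for item_id, attrs in items.items()
            if attrs.get(attr, set()) & set(operands)]

print(select(contact_info, "Phone_Num", "555-854-9885"))
# -> ['101']
print(select(contact_info, "Phone_Num", "555-854-9885", "555-748-7854"))
# -> ['101', '102']
```

Note how adding another phone number is just one more value in the set: no schema change, no extra table, and the lookup stays a simple equality match rather than a substring scan.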