
How-To Tutorials - Data

1204 Articles

Mapping in Oracle Warehouse Database

Packt
29 Sep 2009
5 min read
In this article, we will begin to see the real power and flexibility the Warehouse Builder provides for loading a data warehouse. When we complete the mappings in this article, we will have a complete collection of objects and mappings that we can deploy and run to build and load our data warehouse. The basic procedure for building a mapping is always the same: start by adding a source and a target, and then include any operators in between that are needed for data flow and transformation. Let's start this article with the STORE dimension, where we'll see some new operators that are involved in transformations.

Let's begin by creating a new mapping called STORE_MAP. In the Design Center, we will right-click on the Mappings node of the ACME_DW_PROJECT | Databases | Oracle | ACME_DWH database and select New.... Enter STORE_MAP for the name of the mapping and we will be presented with a blank Mapping Editor window. In this window, we will begin designing our mapping to load data into the STORE dimension.

Adding source and target operators

Once the POS_TRANS_STAGE staging table is loaded with data, it can be used to load data into our dimensions and cube. We'll now use this POS_TRANS_STAGE table as our source table. Let's drag this table onto the mapping from the Explorer window. The target for this mapping is going to be the STORE dimension, so we'll drag this dimension from Databases | Oracle | ACME_DWH | Dimensions onto the mapping and drop it to the right of the POS_TRANS_STAGE table operator. Remember that we build our mappings from left to right, with the source on the left and the target on the right. We'll be sure to leave some space between the two, because we'll be filling that in with some more operators as we proceed.

Now that we have our source and target included, let's take a moment to consider the data elements we're going to need for our target and where to get them from the source. Our target for this mapping, the STORE dimension, has the following attributes at the STORE level for which we'll need source data:

NAME
STORE_NUMBER
ADDRESS1
ADDRESS2
CITY
STATE
ZIP_POSTALCODE
COUNTY
REGION_NAME

For the REGION level, we'll need data for the following attributes:

NAME
DESCRIPTION
COUNTRY_NAME

For the COUNTRY level, we'll need data for the following attributes:

NAME
DESCRIPTION

The complete and fully expanded STORE dimension in our mapping appears like the following screenshot:

We might be tempted to include the ID fields in the above list of data elements to populate, but these are attributes that will be filled in automatically by the Warehouse Builder, using the sequence that was created for us when we built the dimension. We don't have to be concerned with connecting any source data to them.

Now that we know what we need to populate in our STORE dimension, let's turn our attention to the POS_TRANS_STAGE table for the candidate data elements we can use. In this table, we see the following data elements for populating data in our STORE dimension:

STORE_NAME
STORE_NUMBER
STORE_ADDRESS1
STORE_ADDRESS2
STORE_CITY
STORE_STATE
STORE_ZIPPOSTALCODE
STORE_REGION
STORE_COUNTRY

It is easy to see which of these attributes will be used to map data to attributes in the STORE level of the STORE dimension. They will map into the corresponding attributes in the dimension in the STORE group.
We'll need to connect the following attributes together:

STORE_NAME to NAME
STORE_NUMBER to STORE_NUMBER
STORE_ADDRESS1 to ADDRESS1
STORE_ADDRESS2 to ADDRESS2
STORE_CITY to CITY
STORE_STATE to STATE
STORE_ZIPPOSTALCODE to ZIP_POSTALCODE
STORE_REGION to REGION_NAME

There is another attribute in our STORE dimension that we haven't accounted for yet: the COUNTY attribute. We don't have an input attribute that provides direct information for it. It is a special case that we will handle after we take care of these more straightforward attributes, and it will involve the lookup table that we discussed earlier in the introduction of this article.

We're not going to connect the attributes in the list above by simply dragging a line between each pair. There are some issues with the source data that we have to account for in our mapping. Connecting the attributes directly would load the data into the dimension as is, but we have investigated the source data and discovered that much of it contains trailing blanks, due to the way the transactional system stores it. Some of the fields should also be made all uppercase for consistency. Given this additional information, we'll summarize the issues with each of the fields that need to be corrected before loading into the target, and then we'll see how to implement the necessary transformations in the mapping to correct them:

STORE_NAME, STORE_NUMBER: We need to trim spaces and change these attributes to uppercase to facilitate queries, as they are part of the business identifier.
STORE_ADDRESS1, ADDRESS2, CITY, STATE, and ZIP_POSTALCODE: We need to trim spaces and change the STATE attribute to uppercase.
STORE_REGION: We need to trim spaces and change this attribute to uppercase.

All of these requirements can be satisfied by applying pre-existing SQL functions to the data via Transformation Operators, as sketched below.
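The operators themselves are wired up graphically in the Mapping Editor, but the effect they produce is plain SQL. As a rough sketch (not OWB-generated code), the expressions applied to each staging column are equivalent to the following query, assuming the POS_TRANS_STAGE column names listed above:

-- Equivalent SQL for the TRIM/UPPER transformations described above
SELECT
  UPPER(TRIM(store_name))       AS name,            -- business identifier: trim and uppercase
  UPPER(TRIM(store_number))     AS store_number,    -- business identifier: trim and uppercase
  TRIM(store_address1)          AS address1,        -- trim only
  TRIM(store_address2)          AS address2,
  TRIM(store_city)              AS city,
  UPPER(TRIM(store_state))      AS state,           -- trim and uppercase
  TRIM(store_zippostalcode)     AS zip_postalcode,
  UPPER(TRIM(store_region))     AS region_name      -- trim and uppercase
FROM pos_trans_stage;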


Configuring a MySQL linked server on SQL Server 2008

Packt Editorial Staff
16 Aug 2009
7 min read
Linking servers provides an elegant solution when you are faced with running queries against databases on distributed servers, or looking at your distributed assets on disparate databases. This article by Dr. Jay Krishnaswamy explains how to set up a MySQL linked server on SQL Server 2008 Enterprise. Configuring the linked server and querying a table on it are both described. The reader would benefit from reviewing the first article in this series on MySQL servers.

Introduction

MS SQL servers have always provided remote access through RPC mechanisms, but these had the limitation of being confined to invoking procedures on remote locations. A linked server (a virtual server) may be considered a more flexible way of achieving the same thing, with the added benefits of remote table access and distributed queries. Microsoft manages the link mechanism via OLE DB technology: specifically, an OLE DB data source points to the specific database that can be accessed using OLE DB. In this article, we will be creating a MySQL linked server on SQL Server 2008 and querying a table in the database [TestMove], shown in the next listing. In reviewing the previous article, it may be noticed that the Employees table was moved to the MySQL database TestMove. Running the commands from the mysql> prompt assumes that the MySQL server has been started.

Listing 1: employees table in TestMove

mysql> show tables;
+--------------------+
| Tables_in_testmove |
+--------------------+
| employees          |
+--------------------+
1 row in set (0.09 sec)

mysql>

Creating an ODBC DSN for MySQL

In the previous article on MySQL servers cited earlier, a DSN was created for moving data. Essentially the same DSN can be used; what follows is a brief review of the DSN MySQL_Link, created along the same lines as in the previously referenced article. The ODBC driver used for creating this DSN is the one installed on the machine when the MySQL server was installed, as shown. The final interactive window, where you may test the connectivity, is shown in the next figure. You may notice that the database TestMove has been named in the ODBC DSN. The name MySQL_LINK is the ODBC DSN. When you close the window after clicking the OK button, an ODBC DSN item will be added to the System DSN tab of the ODBC wizard, as shown.

Steps to create a MySQL linked server from Management Studio

Right-click the Linked Servers node to display a drop-down menu, as shown in the next figure. Click on the New Linked Server... item. This brings up the New Linked Server window, which is empty except for a default provider. The very first thing to do is to provide a name for this linked server; herein it is LINKED_MYSQL. We will be using one of the providers [Microsoft OLE DB Provider for ODBC] supported by SQL Server 2008. You may access the list of OLE DB providers in the Providers sub-folder of Linked Servers. Provide the other details as shown in the next figure, making sure they are entered exactly as shown, or according to how you have created the database on the MySQL server. Click on the Security list item under General on the left. This opens the Security page of the New Linked Server wizard, as shown. Change the login from the default "Be made without using a security context" to "Be made using this security context" and enter the remote login. In this case, it is "root" for the remote login, and the password is the one used when creating the ODBC DSN (also the password for server authentication).
Make no changes to the Server Options page. Click OK. This creates the linked server Linked_MySQL, shown expanded in the Linked Servers node. You may need to right-click and refresh the Linked Servers node to see the new linked server. As you can see in the figure, the 'User' tables are not displayed.

Running queries and reviewing results

Running system stored procedures can provide various levels of information, and the database can be queried using both the four-part syntax and the openquery() function.

Information on the linked server

It is easy to find out how the linked servers are configured using the system stored procedure sp_linkedservers on SQL Server 2008. Open a query window from File | New | Query with Current Connection and type in the following statement. The next figure shows the statement as well as the returned values. SQL Server 2008 query windows have IntelliSense support, and this can be put to good use.

Exec sp_linkedservers

This shows all servers, both local and remote, as shown in the next figure. Information about the tables on the remote server can also be accessed by running a stored procedure: executing sp_tables_ex, as shown in the next figure (which displays the statement and the result of executing the stored procedure), returns the table information.

Querying the table on the database

Data in a table on the linked server can be queried using the openquery() function. The syntax for this function, shown next, is very simple.

openquery('linked server', 'query')

The next figure shows the result of running the openquery() function on the Linked_MySQL linked server. Although it should be possible to query the linked server using the four-part syntax, as in:

Select * from LINKED_MYSQL...employees

the above statement returns an error. This is probably a limitation of the combination of MSDASQL and the ODBC driver, which does not provide the schema information correctly (this is just the author's opinion).

Are Remote Procedure Calls (RPC) allowed?

The easiest way to test this is to send out a call by running the following query against the linked server.

Execute('Select FirstName, LastName from employees') at Linked_MYSQL

If the linked server is not configured for RPC, then the result of running the above query is an error, as shown in the next figure.

Turn on RPC

Earlier on, we skipped the Server Options page of the linked server. Back in Management Studio, right-click the linked server LINKED_MYSQL and choose Properties at the bottom of the drop-down list. This brings up the LINKED_MYSQL properties window. Click on Server Options, and on that page change the values of RPC and RPC Out to True (the default for both is False). Now run the query that previously produced the error; the result is displayed in the next figure. You might have noted that only two columns were returned from the employees table. This was deliberate, as trying to get all the columns would produce an error, due partly to the data types of the data stored in the table and their compatibility with MSDASQL and the ODBC driver (again, the author's opinion).

Creating the linked server using TSQL

While the linked server can be created using the built-in wizard of Management Studio, it can also be created using TSQL statements, as in the following listing (run both statements; the first one creates the linked server and the second the logins).
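The Server Options change can also be scripted rather than done through the properties window. A minimal TSQL equivalent of those GUI steps, assuming the linked server name LINKED_MYSQL used above, is:

-- Enable RPC and RPC Out on the linked server (equivalent to the Server Options page)
Exec sp_serveroption N'LINKED_MYSQL', N'rpc', N'true'
Exec sp_serveroption N'LINKED_MYSQL', N'rpc out', N'true'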
Listing 2:

Exec master.dbo.sp_addlinkedserver
    @server=N'MySQlbyCode',
    @srvproduct=N'MySQL',
    @provider=N'MSDASQL',
    @datasrc=N'MySQL_link'

Exec master.dbo.sp_addlinkedsrvlogin
    @rmtsrvname=N'MySQlbyCode',
    @useself=N'FALSE',
    @locallogin=NULL,
    @rmtuser=N'root',
    @rmtpassword=N'<your password>'

Summary

The article described the steps involved in configuring a MySQL linked server on SQL Server 2008, using both the built-in New Linked Server wizard and TSQL. Methods to query the linked server, as well as enabling RPC, were also described.

If you have read this article you may be interested in the following:

MySQL Data Transfer using SQL Server Integration Services (SSIS)
Transferring Data from MS Access 2003 to SQL Server 2008
Exporting data from MS Access 2003 to MySQL


MySQL Data Transfer using SQL Server Integration Services (SSIS)

Packt Editorial Staff
12 Aug 2009
8 min read
There are a large number of posts on various difficulties experienced while transferring data from MySQL using Microsoft SQL Server Integration Services. While the transfer of data from MySQL to Microsoft SQL Server 2008 is not fraught with any blocking issues, the transfer of data from SQL Server 2008 to MySQL has presented various problems, and some workarounds have been suggested. In this article by Dr. Jay Krishnaswamy, data transfer to MySQL using SQL Server Integration Services will be described. If you are new to SQL Server Integration Services (SSIS), you may want to read the book by the same author, Beginners Guide to SQL Server Integration Services Using Visual Studio 2005, published by Packt.

Connectivity with MySQL

For data interchange with MySQL there are two options, one of which can be accessed in the connection wizards of SQL Server Integration Services (assuming you have installed the programs); the other can be used to set up an ODBC DSN, as described further down. The two connection options are:

MySQL Connector/ODBC 5.1
MySQL Connector/Net 5.2 (new versions 6.0 and 6.1)

In this article we will be using the ODBC connector for MySQL, which can be downloaded from the MySQL site. The connector will be used to create an ODBC DSN.

Transferring a table from SQL Server 2008 to MySQL

We will transfer a table in the TestNorthwind database on SQL Server 2008 (Enterprise & Evaluation) to a MySQL server database. The MySQL database we are using is described in the article on Exporting data from MS Access 2003 to MySQL. In another article, MySQL Linked Server on SQL Server 2008, creating an ODBC DSN for MySQL was described; we will be using the DSN created in that article.

Creating an Integration Services project in Visual Studio 2008

Start the Visual Studio 2008 program from its shortcut. Click File | New | Project... to open the New Project window and select an Integration Services template from the Business Intelligence projects, providing a suitable name. The project folder will have a file called Package.dtsx, which can be renamed with a custom name.

Add and configure an ADO.NET Source

The project's package designer will be open, displaying the Control Flow tab. Drag and drop a Data Flow Task onto the Control Flow tabbed page. Next, click on the Data Flow tab in the designer to display the Data Flow page, and read the instructions on this page. Drag and drop an ADO.NET Source from the Data Flow Sources items in the Toolbox. It is assumed that you can set up a connection manager to the resident SQL Server 2008 on your machine. The next figure shows the configured connection manager to SQL Server 2008. The table that will be transferred, PrincetonTemp, is in the TestNorthwind database. The authentication is Windows, and a .NET provider is used to access the data. You may also test the connection by clicking the Test Connection button; if the connection shown above is correctly configured, the test should indicate a successful connection. Right-click the ADO.NET Source and from the drop-down menu click Edit. The ADO.NET Source Editor gets displayed. As mentioned earlier, you should be able to access the table and view objects on the database, as shown in the next figure. We have chosen to transfer a simple table, PrincetonTemp, from the TestNorthwind database on SQL Server 2008. It has only a couple of columns, as shown in the Columns page of the ADO.NET Source Editor. The default Error page setting has been assumed, that is, if there is an error or truncation of data the task will fail.
Add an ADO.NET destination and port the data from the source

Drag and drop an ADO.NET Destination item from under the Data Flow Destinations items in the Toolbox onto the Data Flow page of the designer. There are two ways to arrange for the data to flow from source to destination. The easy way is to just drag the green dangling line from the source with your mouse and let go on the ADO.NET Destination. A solid line will connect the source and the destination, as shown.

Configure a connection manager to connect to MySQL

In the Connection Managers pane under the package designer, right-click to display a pop-up menu which allows you to make a new connection. When you agree to make a new ADO.NET connection, the Configure ADO.NET Connection Manager window shows up; click on the New... button on this page. The connection manager's page gets displayed, as shown. In the Providers drop-down you will see a number of providers. There are two providers that you can use: the ODBC provider (through the connector) and the MySQL Data Provider. Click on the Odbc Data Provider. As mentioned previously, we will be using the System DSN MySQL_Link created earlier for the other article, shown in the drop-down list of available ODBC DSNs. Provide the user ID and password, and click the Test Connection button. If all the information is correct you should get a success message, as shown. Close out of the message as well as the Configure ADO.NET Connection Manager window.

Right-click the ADO.NET Destination to display its editor window. In the drop-down for the connection manager you should be able to pick the connection manager you created in the previous step (MySQL_LINK.root), as shown. Click on the New... button to create a table or view. You will get a warning message about the mapping to SSIS not being known, as shown. Click OK. The Create Table window gets displayed, as shown. Notice that the table displays all the columns that the source is sending out. If you were to click OK, you would get an error that the syntax is not correct, as shown. Modify the statement as shown below to change the destination table name (your choice) and the data type.

CREATE TABLE From2k8 (
"Id" INT,
"Month" VARCHAR(10),
"Temperature" DOUBLE PRECISION,
"RecordHigh" DOUBLE PRECISION
)

Click OK. Again you get the same error about the syntax not being correct. Modify the CREATE TABLE statement further, as shown.

CREATE TABLE From2k8 (
Id INT,
Month VARCHAR(10),
Temperature DOUBLE PRECISION,
RecordHigh DOUBLE PRECISION
)

Click OK after the above modification. The table gets added to the ADO.NET Destination Editor, as shown. Click on Mappings on the left side of the ADO.NET Destination Editor. The column mappings page gets displayed, as shown. We accept the default settings for the Error Output page. Click OK.

Build the project and execute the package by right-clicking the package and choosing Execute Package. The program runs and processes the package, and ends up being unsuccessful, with the error message in the Progress tab of the project as shown (only the relevant message is reproduced here).

....
.....
[SSIS.Pipeline] Information: Execute phase is beginning.
[ADO NET Destination 1 [165]] Error: An exception has occurred during data insertion, the message returned from the provider is: ERROR [42000] [MySQL][ODBC 5.1 Driver][mysqld-5.1.30-community]You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '"Id", "Month", "Temperature", "RecordHigh") VALUES (1, 'Jan ', 4.000000000' at line 1
[SSIS.Pipeline] Error: SSIS Error Code DTS_E_PROCESSINPUTFAILED. The ProcessInput method on component "ADO NET Destination 1" (165) failed with error code 0xC020844B while processing input "ADO NET Destination Input" (168). The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running. There may be error messages posted before this with more information about the failure.
[SSIS.Pipeline] Information: Post Execute phase is beginning.
......
....
Task Data Flow Task failed
....

Start the MySQL server and log in to it. Run the commands shown in the next figure. Setting the mode to 'ANSI' makes the syntax more standard-like, as MySQL can cater to clients using other SQL modes; the default mode does not accept the double-quoted identifiers that SSIS generates, which is why the above error is returned even though the statement appears correct. In fact, a CREATE statement that ran without error directly at the MySQL command line returned an error when SSIS was used to create the same table. After running the statements, build the BI project and execute the package. This time the execution will be successful, and you can query the MySQL server as in the following:

Summary

The article describes, step by step, transferring a table from SQL Server 2008 to MySQL using ODBC connectivity. For a successful transfer, the data type differences between SQL Server 2008 and the MySQL version must be properly taken into consideration, as well as correctly setting the sql_mode property of the MySQL server.

Further resources on this subject:

Easy guide to understand WCF in Visual Studio 2008 SP1 and Visual Studio 2010 Express
MySQL Linked Server on SQL Server 2008
Displaying MySQL data on an ASP.NET Web Page
Exporting data from MS Access 2003 to MySQL
Transferring Data from MS Access 2003 to SQL Server 2008
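The figure with the mysql> commands is not reproduced here. A minimal sketch of the mode change it describes, assuming a session with the privilege to set global variables, is:

mysql> SET GLOBAL sql_mode = 'ANSI';
mysql> SELECT @@global.sql_mode;

The ANSI mode includes ANSI_QUOTES, which is what lets MySQL accept the double-quoted column names in the INSERT statement generated by the ADO.NET destination.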


Creating External Tables in your Oracle 10g/11g Database

Packt Editorial Staff
07 Jun 2009
16 min read
In this two-part article by Hector R. Madrid, we will learn about External Tables in the Oracle 10g/11g database. When working in data warehouse environments, the Extraction-Transformation-Loading (ETL) cycle frequently requires the user to load information from external sources in plain file format, or to perform data transfers among Oracle databases in a proprietary format. This requires the user to create control files to perform the load. As the format of the source data regularly doesn't fit the one required by the data warehouse, a common practice is to create stage tables that load data into the database, and then several queries that perform the transformation from that point on, to take the data to its final destination.

A better approach is to perform this transformation 'on the fly' at load time. That is what External Tables are for. They are basically external files, managed either by means of the SQL*Loader or the data pump driver, which from the database perspective can be seen as if they were regular read-only tables. This format allows the user to think about the data source as if the data were already loaded into its stage table. This lets the user concentrate on the queries that perform the transformation, thus saving precious time during the load phase.

The basics of External Tables in Oracle 10g/11g

An External Table is basically a file that resides on the server side, as a regular flat file or as a data pump formatted file. The External Table is not a table itself; it is an external file with an Oracle format and a physical location. This feature first appeared back in Oracle 9i Release 1, and it was intended as a way of enhancing the ETL process by reading an external flat file as if it were a regular Oracle table. On its initial release it was only possible to create read-only External Tables but, starting with 10g, it is possible to unload data to External Tables too.

In releases prior to 10g, only the SQL*Loader driver could be used to read an External Table; from 10g onwards it is also possible to load the table by means of the data pump driver. The kind of driver that will be used to read the External Table is defined at creation time. In the case of ORACLE_LOADER it is the same driver used by SQL*Loader. The flat files are loaded in the same way that a flat file is loaded to the database by means of the SQL*Loader utility, and the creation script can be built from a SQL*Loader control file. In fact, most of the keywords that are valid for data loading are also valid for reading an external flat file table.

The main differences between SQL*Loader and External Tables are:

When there are several input datafiles, SQL*Loader will generate a bad file and a discard file for each datafile.
The CONTINUEIF and CONCATENATE keywords are not supported by External Tables.
The GRAPHIC, GRAPHIC EXTERNAL, and VARGRAPHIC types are not supported for External Tables.
LONG, nested tables, VARRAY, REF, primary key REF, and SID are not supported.
For fields in External Tables, the character set, decimal separator, date mask, and other locale settings are determined by the database NLS settings.
The use of the backslash character is allowed for SQL*Loader, but for External Tables it would raise an error.
External Tables must use quotation marks instead. For example:

SQL*Loader: FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY ""
External Tables: TERMINATED BY ',' ENCLOSED BY "'"

A second driver is available, the ORACLE_DATAPUMP access driver, which uses the Data Pump technology to read the table and unload data to an External Table. This driver allows the user to perform a logical backup that can later be read back into the database without actually loading the data. The ORACLE_DATAPUMP access driver utilizes a proprietary binary format for the external file, so it is not possible to view it as a flat file.

Let's set up the environment

Let's create the demonstration user and prepare its environment to create an External Table. The example developed first refers to an External Table using the ORACLE_LOADER driver.

create user EXTTABDEMO
identified by ORACLE
default tablespace USERS;

alter user exttabdemo
quota unlimited on users;

grant CREATE SESSION,
CREATE TABLE,
CREATE PROCEDURE,
CREATE MATERIALIZED VIEW,
ALTER SESSION,
CREATE VIEW,
CREATE ANY DIRECTORY
to EXTTABDEMO;

A simple formatted spool from the following query will generate the required External Table demonstration data. The original source table is the demonstration HR.EMPLOYEES table.

select
EMPLOYEE_ID ||','||
DEPARTMENT_ID ||','||
FIRST_NAME ||','||
LAST_NAME ||','||
PHONE_NUMBER ||','||
HIRE_DATE ||','||
JOB_ID ||','||
SALARY ||','||
COMMISSION_PCT ||','||
MANAGER_ID ||','||
EMAIL
from HR.EMPLOYEES
order by EMPLOYEE_ID

The above query will produce the following sample data:

The External Table directory is defined inside the database by means of a DIRECTORY object. This object is not validated at creation time, so the user must make sure the physical directory exists and that the oracle OS user has read/write privileges on it.

$ mkdir $HOME/external_table_dest

SQL> CREATE DIRECTORY EXTTABDIR AS '/home/oracle/external_table_dest';

The above example was developed in a Linux environment; on Windows platforms the paths will need to be changed to correctly reflect how Oracle has been set up. Now the first External Table can be created.

A basic External Table

Here is the source code of the External Table creation. The create table command syntax is just like any other regular table creation (A), (B), up to the point where the ORGANIZATION EXTERNAL (C) keyword appears; this is where the actual External Table definition starts. In this case the External Table is accessed by the ORACLE_LOADER driver (D). Next, the external flat file is defined, and here the Oracle DIRECTORY (E) where the flat file resides is declared. The ACCESS PARAMETERS (F) section specifies how to access the flat file; it declares whether the file has fixed or variable size records, along with whatever other SQL*Loader loading options are required. The LOCATION (H) keyword defines the name of the external data file. It must be pointed out that, as this is an External Table managed by the SQL*Loader driver, the ACCESS PARAMETERS section must be defined; in the case of External Tables based on the data pump driver this section is not required. The fields are declared only by name (G), not by type. This is permitted from the SQL*Loader perspective, and it allows for dynamic column definition. This column schema definition is more flexible, but it has a drawback: data formats, such as those in DATE columns, must match the database date format, otherwise the row will be rejected. There are ways to define the date format, working around this requirement.
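The creation script itself appears in the original article as a screenshot. A minimal sketch consistent with the (A)-(H) walkthrough and the spool query above might look like the following; the table name ext_employees and the file name employees.txt are taken from later examples in this article, and the exact access parameters are illustrative:

CREATE TABLE ext_employees (           -- (A), (B): a regular column list
  employee_id    NUMBER(6),
  department_id  NUMBER(4),
  first_name     VARCHAR2(20),
  last_name      VARCHAR2(25),
  phone_number   VARCHAR2(20),
  hire_date      DATE,
  job_id         VARCHAR2(10),
  salary         NUMBER(8,2),
  commission_pct NUMBER(2,2),
  manager_id     NUMBER(6),
  email          VARCHAR2(25)
)
ORGANIZATION EXTERNAL (                -- (C): the External Table definition starts here
  TYPE ORACLE_LOADER                   -- (D): the access driver
  DEFAULT DIRECTORY EXTTABDIR          -- (E): directory where the flat file resides
  ACCESS PARAMETERS (                  -- (F): how to read the flat file
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    ( employee_id, department_id, first_name,  -- (G): fields declared by name only
      last_name, phone_number, hire_date,
      job_id, salary, commission_pct,
      manager_id, email )
  )
  LOCATION ('employees.txt')           -- (H): the external data file
)
REJECT LIMIT UNLIMITED;

Once created, a plain select * from ext_employees reads employees.txt directly; REJECT LIMIT UNLIMITED keeps a query from aborting on malformed rows (the default reject limit is 0).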
Assuming the date column changes from its original default format mask "DD-MON-RR" to "DD-MM-RR", the column definition must change from a simple CHAR column to a DATE with a format mask column definition.

Original format: "HIRE_DATE" CHAR(255)
Changed format: "HIRE_DATE" DATE "DD-MM-RR"

When working with an External Table, the access parameters are not validated at creation time, so if there are malformed rows, or if the access parameters are improperly defined, an error similar to the one below is shown.

ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEFETCH callout
ORA-30653: reject limit reached
ORA-06512: at "SYS.ORACLE_LOADER", line 52

Once the data is created and all required OS privileges have been properly validated, the data can be seen from inside the database, just as if it were a regular Oracle table. This table is read only, so if the user attempts to perform any DML operation against it, the attempt will result in this error:

SQL> delete ext_employees;
delete ext_employees
*
ERROR at line 1:
ORA-30657: operation not supported on external organized table

As the error message clearly states, this kind of table is only useful for read-only operations. It doesn't support most of the operations available for regular tables, such as index creation and statistics gathering; these types of operations will cause an ORA-30657 error too. The only access method available for External Tables is a full table scan, so there is no way to perform a selective data retrieval operation.

External Tables cannot be recovered; they are just metadata definitions stored in the dictionary tables. The actual data resides in external files, and there is no way to protect them with the regular database backup routines, so it is the user's sole responsibility to provide proper backup and data management procedures. At the database level, the only kind of protection an External Table receives is at the metadata level, as it is an object stored as a definition in the database dictionary. As the data resides in the external data file, if that file were by any means corrupted, altered, or somehow modified, there would be no way to get back the original data. If the external data file is lost, this may go unnoticed until the next SELECT operation takes place.

The metadata for an External Table is recorded in the {USER | ALL | DBA}_TABLES views, and as the table doesn't actually require physical database storage, all storage-related columns appear as null, as do the columns that relate to statistical information. The table is further described in the {USER | ALL | DBA}_EXTERNAL_TABLES views, where information such as the kind of access driver, the reject limit, and the access parameters, amongst others, is recorded.

SQL> DESC USER_EXTERNAL_TABLES

Name                            Null?    Type
------------------------------- -------- --------------
TABLE_NAME                      NOT NULL VARCHAR2(30)
TYPE_OWNER                               CHAR(3)
TYPE_NAME                       NOT NULL VARCHAR2(30)
DEFAULT_DIRECTORY_OWNER                  CHAR(3)
DEFAULT_DIRECTORY_NAME          NOT NULL VARCHAR2(30)
REJECT_LIMIT                             VARCHAR2(40)
ACCESS_TYPE                              VARCHAR2(7)
ACCESS_PARAMETERS                        VARCHAR2(4000)
PROPERTY                                 VARCHAR2(10)

This is the first basic External Table and, as previously shown, its creation is pretty simple.
It allows external data to be easily accessed from inside the database, letting the user see the external data just as if it were already loaded inside a regular stage table.

Creating External Table metadata, the easy way

To further illustrate the tight relationship between SQL*Loader and External Tables, the SQL*Loader tool may be used to generate a script that creates an External Table according to a pre-existing control file. SQL*Loader has a command line option named EXTERNAL_TABLE, which can hold one of three different values: {NOT_USED | GENERATE_ONLY | EXECUTE}. If nothing is set, it defaults to NOT_USED. This keyword is used to generate the script that creates an External Table, and the options mean:

NOT_USED: This is the default option, and it means that no External Tables are to be used in this load.
GENERATE_ONLY: If this option is specified, then SQL*Loader will only read the definitions from the control file and generate the required commands, so the user can record them for later execution, or tailor them to fit particular needs.
EXECUTE: This not only generates the External Table scripts, but also executes them. If the user requires a sequence, then the EXECUTE option will not only create the table, but also the required sequence, deleting it once the data load is finished. This option performs the data load process against the specified regular target table by means of an External Table; it creates both the directory and the External Table, and inserts the data using an INSERT AS SELECT with the APPEND hint.

Let's use the GENERATE_ONLY option to generate the External Table creation scripts:

$ sqlldr exttabdemo/oracle employees external_table=GENERATE_ONLY

By default the log is written to a file whose extension is .log and whose name equals that of the control file.
By opening it we see, among all the log processing information, this set of DDL commands:

CREATE TABLE "SYS_SQLLDR_X_EXT_EMPLOYEES"
(
"EMPLOYEE_ID" NUMBER(6),
"FIRST_NAME" VARCHAR2(20),
"LAST_NAME" VARCHAR2(25),
"EMAIL" VARCHAR2(25),
"PHONE_NUMBER" VARCHAR2(20),
"HIRE_DATE" DATE,
"JOB_ID" VARCHAR2(10),
"SALARY" NUMBER(8,2),
"COMMISSION_PCT" NUMBER(2,2),
"MANAGER_ID" NUMBER(6),
"DEPARTMENT_ID" NUMBER(4)
)
ORGANIZATION external
(
TYPE oracle_loader
DEFAULT DIRECTORY EXTTABDIR
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE CHARACTERSET US7ASCII
BADFILE 'EXTTABDIR':'employees.bad'
LOGFILE 'employees.log_xt'
READSIZE 1048576
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"' LDRTRIM
REJECT ROWS WITH ALL NULL FIELDS
(
"EMPLOYEE_ID" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"FIRST_NAME" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"LAST_NAME" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"EMAIL" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"PHONE_NUMBER" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"HIRE_DATE" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"JOB_ID" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"SALARY" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"COMMISSION_PCT" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"MANAGER_ID" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"',
"DEPARTMENT_ID" CHAR(255) TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
)
)
location
(
'employees.txt'
)
)

This is the more complete version of the script; some differences from the basic script are:

All the column definitions are set to CHAR(255), with the delimiter character defined for each column.
If the current working directory is already registered as a regular DIRECTORY at the database level, SQL*Loader utilizes it; otherwise, it creates a new directory definition.
The script specifies where the bad file and log file are located.
It specifies that an all-null column record is rejected.

In the case of the EXECUTE keyword, the log file shows that the scripts are not only used to create the External Table, but also to execute the INSERT statement with the /*+ append */ hint. The load is performed in direct path mode.

All External Tables, when accessed, generate a log file. In the case of the ORACLE_LOADER driver, this file is similar to the one generated by SQL*Loader. It has a different format in the case of the ORACLE_DATAPUMP driver. The log file is generated in the same location where the external file resides, and its name follows this format:

<EXTERNAL_TABLE_NAME>_<OraclePID>.log

When an ORACLE_LOADER managed External Table has errors, it dumps the 'bad' rows to the *.bad file, just the same as if the load were performed by SQL*Loader. An ORACLE_DATAPUMP External Table generates a simpler log file; it only contains the time stamp when the External Table was accessed, and it creates a log file for each oracle process accessing the External Table.
Unloading data to External Tables

The driver used to unload data to an External Table is the ORACLE_DATAPUMP access driver. It dumps the contents of a table to a binary file in a proprietary format. This way you can exchange data with other 10g and higher databases, preformatted to meet the other database's requirements. Unloading data to an External Table doesn't make it updateable; the tables are still limited to being read only. Let's unload the EMPLOYEES table to an External Table:

create table dp_employees
organization external(
type oracle_datapump
default directory EXTTABDIR
location ('dp_employees.dmp')
)
as select * from employees;

This creates a table named DP_EMPLOYEES, located at the specified EXTTABDIR directory, with a defined OS file name. In the next example, at a different database, a new DP_EMPLOYEES table is created; this table uses the data already unloaded by the first database. This DP_EMPLOYEES External Table is created on the 11g database side.

create table dp_employees(
EMPLOYEE_ID NUMBER(6),
FIRST_NAME VARCHAR2(20),
LAST_NAME VARCHAR2(25),
EMAIL VARCHAR2(25),
PHONE_NUMBER VARCHAR2(20),
HIRE_DATE DATE,
JOB_ID VARCHAR2(10),
SALARY NUMBER(8,2),
COMMISSION_PCT NUMBER(2,2),
MANAGER_ID NUMBER(6),
DEPARTMENT_ID NUMBER(4)
)
organization external (
type oracle_datapump
default directory EXTTABDIR
location ('dp_employees.dmp')
);

This table can read in the data unloaded by the first database. The second database is a regular 11g database, so this shows the upward inter-version compatibility between a 10g and an 11g database.

SQL> select count(*) from dp_employees;

COUNT(*)
----------
107

Inter-version compatibility

In the previous example, a 10g data pump generated an External Table that was transparently read by the 11g release. Let's create an 11g data pump External Table named DP_DEPARTMENTS:

create table dp_departments
organization external(
type oracle_datapump
default directory EXTTABDIR
access parameters (
version '10.2.0'
)
location ('dp_departments.dmp')
)
as select * from departments;

Table created.

SQL> select count(*) from dp_departments;

COUNT(*)
----------
27

In the previous example, it is important to point out that the VERSION keyword defines the compatibility format:

access parameters ( version '10.2.0' )

If this clause is not specified, then an incompatibility error will be displayed when the older release reads the file.

SQL> select count(*) from dp_departments;
select count(*) from dp_departments
*
ERROR at line 1:
ORA-29913: error in executing ODCIEXTTABLEOPEN callout
ORA-39142: incompatible version number 2.1 in dump file
"/home/oracle/external_table_dest/dp_departments.dmp"
ORA-06512: at "SYS.ORACLE_DATAPUMP", line 19

Now let's use the 10g version to read from it.

SQL> select count(*) from dp_departments;

COUNT(*)
----------
27

The VERSION clause is interpreted the same way as the VERSION clause for a data pump export; it takes three different values:

COMPATIBLE: The version of the metadata corresponds to the database compatibility level.
LATEST: This corresponds to the database version.
VERSION NUMBER: This refers to a specific Oracle version that the file is compatible with. This value cannot be lower than 9.2.0.

Summary

As we can see, External Tables serve not only as improvements to the ETL process, but also as a means to manage database environments, and to reduce the complexity level of data management from the user's point of view.
In the next part, we will see how External Tables can be used for data transformation and the enhancements Oracle 11g has brought about in External Tables.