Conventional and Direct Path Loads

Data Loading Methods

SQL*Loader provides two methods for loading data: conventional path load and direct path load.

A conventional path load executes SQL INSERT statements to populate tables in an Oracle database. A direct path load eliminates much of the Oracle database overhead by formatting Oracle data blocks and writing the data blocks directly to the database files. A direct load does not compete with other users for database resources, so it can usually load data at near disk speed. Considerations inherent to direct path loads, such as restrictions, security, and backup implications, are discussed in this chapter.

The tables to be loaded must already exist in the database. SQL*Loader never creates tables. It loads existing tables that either already contain data or are empty.

The following privileges are required for a load:

  • You must have INSERT privileges on the table to be loaded.
  • You must have DELETE privileges on the table to be loaded, when using the REPLACE or TRUNCATE option to empty old data from the table before loading the new data in its place.
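The privilege requirements above can be sketched as a small SQL script generated from the shell. The account name loader_user and the table scott.emp are hypothetical placeholders, not names taken from this chapter; substitute your own.

```shell
# Write a SQL script that grants the privileges a load needs.
# loader_user and scott.emp are hypothetical names used for illustration.
cat > grant_load_privs.sql <<'EOF'
-- INSERT is required for every load.
GRANT INSERT ON scott.emp TO loader_user;
-- DELETE is also required when the load uses REPLACE or TRUNCATE.
GRANT DELETE ON scott.emp TO loader_user;
EOF

echo "wrote grant_load_privs.sql"
```

The table owner or a suitably privileged user would then run the generated script in a SQL session before the load starts.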

Figure 11-1 shows how conventional and direct path loads perform database writes.

Figure 11-1 Database Writes on SQL*Loader Direct Path and Conventional Path

Loading ROWID Columns

In both conventional path and direct path, you can specify a text value for a ROWID column. (This is the same text you get when you perform a SELECT ROWID FROM table_name operation.) The character string interpretation of the ROWID is converted into the ROWID type for a column in a table.

Conventional Path Load

Conventional path load (the default) uses the SQL INSERT statement and a bind array buffer to load data into database tables. This method is used by all Oracle tools and applications.

When SQL*Loader performs a conventional path load, it competes equally with all other processes for buffer resources. This can slow the load significantly. Extra overhead is added as SQL statements are generated, passed to Oracle, and executed.

The Oracle database looks for partially filled blocks and attempts to fill them on each insert. Although appropriate during normal use, this can slow bulk loads dramatically.
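As a minimal sketch of requesting a conventional path load explicitly, the following writes a SQL*Loader parameter file and prints the command that would use it. The file names, the account loader_user, and the rows and bindsize values are illustrative assumptions, not tuning recommendations.

```shell
# Write a parameter file for a conventional path load.
# File names and numeric values are hypothetical examples.
cat > emp_conventional.par <<'EOF'
control=emp.ctl
log=emp.log
bad=emp.bad
direct=false
rows=5000
bindsize=1048576
EOF

# Print the command rather than running it, because it needs a live database.
echo "sqlldr userid=loader_user parfile=emp_conventional.par"
```

Here direct=false selects the conventional path (it is also the default), rows sets how many rows go into the bind array, and bindsize caps the bind array size in bytes.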

Conventional Path Load of a Single Partition

By definition, a conventional path load uses SQL INSERT statements. During a conventional path load of a single partition, SQL*Loader uses the partition-extended syntax of the INSERT statement, which has the following form:

INSERT INTO T PARTITION (P) VALUES ...

The SQL layer of the Oracle kernel determines if the row being inserted maps to the specified partition. If the row does not map to the partition, the row is rejected, and the SQL*Loader log file records an appropriate error message.
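A single-partition load can be sketched with a control file that names the partition in the INTO TABLE clause. The table sales, the partition sales_q1, the column names, and the file names below are hypothetical and chosen only for illustration.

```shell
# Write a control file that loads one partition of a partitioned table.
# sales, sales_q1, and the field names are hypothetical.
cat > sales_q1.ctl <<'EOF'
LOAD DATA
INFILE 'sales_q1.dat'
APPEND
INTO TABLE sales PARTITION (sales_q1)
FIELDS TERMINATED BY ','
(
  sale_id    INTEGER EXTERNAL,
  sale_date  DATE 'YYYYMMDD',
  amount     DECIMAL EXTERNAL
)
EOF

# Print the command rather than running it, because it needs a live database.
echo "sqlldr userid=loader_user control=sales_q1.ctl log=sales_q1.log bad=sales_q1.bad"
```

After such a load, the log file named by log= is the place to look for rows that were rejected because they did not map to the named partition.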

When to Use a Conventional Path Load

If load speed is most important to you, you should use direct path load because it is faster than conventional path load. However, certain restrictions on direct path loads may require you to use a conventional path load. You should use a conventional path load in the following situations:

  • When accessing an indexed table concurrently with the load, or when applying inserts or updates to a nonindexed table concurrently with the load
    To use a direct path load (with the exception of parallel loads), SQL*Loader must have exclusive write access to the table and exclusive read/write access to any indexes.
  • When loading data into a clustered table
    A direct path load does not support loading of clustered tables.
  • When loading a relatively small number of rows into a large indexed table
    During a direct path load, the existing index is copied when it is merged with the new index keys. If the existing index is very large and the number of new keys is very small, then the index copy time can offset the time saved by a direct path load.
  • When loading a relatively small number of rows into a large table with referential and column-check integrity constraints
    Because these constraints cannot be applied to rows loaded on the direct path, they are disabled for the duration of the load. Then they are applied to the whole table when the load completes. The costs could outweigh the savings for a very large table and a small number of new rows.
  • When loading records and you want to ensure that a record is rejected under any of the following circumstances:
    • If the record, upon insertion, causes an Oracle error
    • If the record is formatted incorrectly, so that SQL*Loader cannot find field boundaries
    • If the record violates a constraint or tries to make a unique index non-unique

Direct Path Load

Instead of filling a bind array buffer and passing it to the Oracle database with a SQL INSERT statement, a direct path load uses the direct path API to pass the data to be loaded to the load engine in the server. The load engine builds a column array structure from the data passed to it.

The direct path load engine uses the column array structure to format Oracle data blocks and build index keys. The newly formatted database blocks are written directly to the database (multiple blocks per I/O request using asynchronous writes if the host platform supports asynchronous I/O).

Internally, multiple buffers are used for the formatted blocks. While one buffer is being filled, one or more buffers are being written if asynchronous I/O is available on the host platform. Overlapping computation with I/O increases load performance.
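One way to request a direct path load is an OPTIONS clause at the top of the control file; passing direct=true on the sqlldr command line is the equivalent alternative. The sketch below assumes a hypothetical emp table with three columns and hypothetical file names.

```shell
# Write a control file that asks for a direct path load.
# Table, field, and file names are hypothetical.
cat > emp_direct.ctl <<'EOF'
OPTIONS (DIRECT=TRUE)
LOAD DATA
INFILE 'emp.dat'
APPEND
INTO TABLE emp
FIELDS TERMINATED BY '|'
(
  empno     INTEGER EXTERNAL,
  ename     CHAR(10),
  hiredate  DATE 'YYYYMMDD'
)
EOF

# Print the command rather than running it, because it needs a live database.
echo "sqlldr userid=loader_user control=emp_direct.ctl log=emp_direct.log"
```

Keeping the option in the control file means the load path travels with the load definition; putting it on the command line makes it easier to switch paths for a one-off run.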

Data Conversion During Direct Path Loads

During a direct path load, data conversion occurs on the client side rather than on the server side. This means that NLS parameters in the initialization parameter file (server-side language handle) will not be used. To override this behavior, you can specify a format mask in the SQL*Loader control file that is equivalent to the setting of the NLS parameter in the initialization parameter file, or set the appropriate environment variable. For example, to specify a date format for a field, you can either set the date format in the SQL*Loader control file as shown in Example 11-1 or set an NLS_DATE_FORMAT environment variable as shown in Example 11-2.

Example 11-1 Setting the Date Format in the SQL*Loader Control File

LOAD DATA
INFILE 'data.dat'
INSERT INTO TABLE emp
FIELDS TERMINATED BY "|"
(
EMPNO INTEGER EXTERNAL,
ENAME CHAR(10),
JOB CHAR(9),
MGR INTEGER EXTERNAL,
HIREDATE DATE 'YYYYMMDD',
SAL DECIMAL EXTERNAL,
COMM DECIMAL EXTERNAL,
DEPTNO INTEGER EXTERNAL
)

Example 11-2 Setting an NLS_DATE_FORMAT Environment Variable

On UNIX Bourne or Korn shell:

% NLS_DATE_FORMAT='YYYYMMDD'
% export NLS_DATE_FORMAT

On UNIX csh:

% setenv NLS_DATE_FORMAT 'YYYYMMDD'

credit: https://docs.oracle.com/cd/B28359_01/server.111/b28319/ldr_modes.htm#i1008815