Working with Data Pump

Let's look at the following topics in detail:

Data Pump Best Practices
Do not invoke expdp using SYS
Purge the recycle bin before export (user/table/DB level)
PARALLELISM doesn't work with LOB columns
How to use/estimate the PARALLEL parameter in Data Pump?
How to check/monitor Data Pump jobs?
Data Pump uses two different load methods during import (impdp)

Data Pump Best Practices

pga_aggregate_target --> Set this high; it improves Data Pump performance.

For export consistency use:-
FLASHBACK_TIME=SYSTIMESTAMP --> This increases UNDO requirements for the duration of the export.

compression_algorithm=medium --> Recommended option from 12c. Similar characteristics to BASIC, but uses a different algorithm.

Always set parameters:-
LOGTIME=ALL --> Adds timestamps to log messages (from 12c)
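
As a rough illustration, the export settings above could be collected in a parameter file. This is only a sketch: the schema HR and the file names are placeholders, DATA_PUMP_DIR is assumed to be usable as the directory object, and COMPRESSION=ALL together with COMPRESSION_ALGORITHM assumes the Advanced Compression option is licensed.

```shell
# Write a parameter file that combines the export settings above.
# HR, exp_hr_%U.dmp and exp_hr.log are placeholder names (assumptions).
cat > exp_best_practice.par <<'EOF'
SCHEMAS=HR
DIRECTORY=DATA_PUMP_DIR
DUMPFILE=exp_hr_%U.dmp
LOGFILE=exp_hr.log
FLASHBACK_TIME=SYSTIMESTAMP
COMPRESSION=ALL
COMPRESSION_ALGORITHM=MEDIUM
LOGTIME=ALL
EOF

# Then run the export as a non-SYS user, for example:
#   expdp system parfile=exp_best_practice.par
```

Keeping the options in a parfile also avoids shell quoting problems on the command line.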

Speed up Data Pump:-
1. Initial impdp with EXCLUDE=INDEX
2. Second impdp with INCLUDE=INDEX SQLFILE=indexes.sql (writes the index DDL to a file without executing it)
3. Split indexes.sql into multiple SQL files and run them in multiple sessions
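
Step 3 can be sketched with a small shell function. This is an illustration, not a tested production script: the function name split_index_sql and the output names indexes_part_N.sql are made up here, and it assumes every index definition in the SQL file starts on a line beginning with "CREATE ". Any lines before the first CREATE (session settings, comments) end up in the first file only, so check whether the other files need them too.

```shell
# Distribute the statements in an index SQL file across N files, round-robin.
# Assumption: each index definition starts on a line beginning with "CREATE ".
split_index_sql() {
  src=$1      # file written by impdp ... SQLFILE=indexes.sql
  parts=$2    # number of output files, one per SQL*Plus session
  awk -v n="$parts" '
    /^CREATE / { i++ }    # a new statement begins: move on to the next file
    {
      # Lines that follow a CREATE (for example ALTER INDEX) stay with it.
      f = sprintf("indexes_part_%d.sql", (i > 0 ? (i - 1) % n : 0) + 1)
      print > f
    }
  ' "$src"
}

# Example: split_index_sql indexes.sql 4
# then run each indexes_part_N.sql in its own session.
```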


Direct import via database link (network bandwidth and CPU bound):-
Parameter: NETWORK_LINK
Run only impdp on the target system; no expdp is necessary.
No dump file is written, so there is no dump file I/O and no file transfer.

Restrictions of database links apply: does not work with LONG/LONG RAW and certain object types.
Performance depends on network bandwidth and the target's CPUs.

Some Commands /Use Cases

remap_tablespace=OLD_TBS:NEW_TBS ==> Move all objects from one tablespace to another
remap_schema=old_schema:new_schema ==> Move objects to a different schema
expdp with content=metadata_only & impdp with remap_schema=A:Z ==> Clone a user
remap_datafile='/u01/app/oracle/oradata/datafile_01.dbf':'/u01/datafile_01.dbf' ==> Create your database in a different file structure
sample=70 (expdp) ==> Export a 70% sample of the rows
transform=pctspace:70 (impdp) ==> Reduce extent allocations and datafile sizes to 70% of the original
Note: database-level FORCE LOGGING overrides any request to skip redo logging during import, such as TRANSFORM=DISABLE_ARCHIVE_LOGGING:Y (12c).

EXPDP Filesize : Split or Slice the Dump file into Multiple Directories
expdp srinalla/srinalla job_name=exp_job_multiple_dir  schemas=STHOMAS  filesize=3G dumpfile=datapump:expdp_datapump_%U.dmp,TESTING:expdp_testing_%U.dmp logfile=dump.log compression=all parallel=10

During import, list the dump files with the same directory:file notation.
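
The matching import command is not shown above. A plausible counterpart, assuming the same directory objects (datapump and TESTING) exist on the target and using a made-up log file name, could look like this parameter file:

```shell
# Import parameter file that reads the dump set from both directory objects.
# The log file name is a placeholder (assumption); the rest mirrors the export.
cat > imp_multiple_dir.par <<'EOF'
SCHEMAS=STHOMAS
DUMPFILE=datapump:expdp_datapump_%U.dmp,TESTING:expdp_testing_%U.dmp
LOGFILE=datapump:impdp_multiple_dir.log
PARALLEL=10
EOF

# Then, for example:
#   impdp srinalla parfile=imp_multiple_dir.par
```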

Statistics are imported by default.
CLUSTER (default=Y, from 11gR2): controls parallelization in RAC. Workers can run on all nodes or only on some nodes, depending on the service, e.g. service_name=EBS_DP_12
         Status check: select inst_id, session_type from dba_datapump_sessions;

COMMIT=Y belongs to the original imp utility, not impdp. It commits after each array insert instead of after each table.
If COMMIT=Y, Import commits tables containing LONG, LOB, BFILE, ROWID, UROWID,
DATE or Type Columns after each row.
To restart a job with a different degree of parallelism, say 4 (earlier it was 6):
Export> parallel=4
Export> continue_client   (returns to logging mode and shows progress)

To reload tables that were skipped, import again with table_exists_action=replace and TABLES=(list of skipped tables).

nohup impdp system/secret NETWORK_LINK=olddb FULL=y  PARALLEL=25 &
impdp system attach
Import> status
Import> parallel=30   (increases the number of parallel processes while the job runs)

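Do not invoke expdp using '/ as sysdba'

Use the SYSTEM account (or another suitably privileged non-SYS user) instead. The Oracle Utilities documentation notes that Export should not be started as SYSDBA except at the request of Oracle technical support.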

Purge the recycle bin before export (user/table/DB level)
select object_name, original_name, type, can_undrop as "UND", can_purge as "PUR", droptime from dba_recyclebin
where owner = 'XX_DUMMY';
purge table "BIN$HGnc55/7rRPgQPeM/qQoRw==$0";
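
The statement above covers the table level. For the user and database levels, a minimal sketch is to write the corresponding PURGE statements into a script and run it from SQL*Plus. The script name is made up here, and the database-level purge assumes a suitably privileged connection.

```shell
# Write a small SQL script with the user-level and database-level purges.
# purge_recyclebin.sql is a placeholder file name (assumption).
cat > purge_recyclebin.sql <<'EOF'
-- User level: empties the recycle bin of the connected user
PURGE RECYCLEBIN;
-- Database level: empties every user's recycle bin (needs suitable privileges)
PURGE DBA_RECYCLEBIN;
EOF

# Run it before the export, for example:
#   sqlplus / as sysdba @purge_recyclebin.sql
```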

** PARALLELISM doesn't work with LOB columns
Parallelism does not help for a table with LOB columns, because Data Pump serializes the unload of that table.
The approach should be to run two jobs:
1) the whole database/schema minus the LOB table, and
2) the LOB table on its own.
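
The two-job approach can be sketched as a pair of export parameter files. The schema APP, the table DOCS and all file names are hypothetical, and DATA_PUMP_DIR is assumed to be the directory object in use.

```shell
# Job 1: the schema without the LOB table (APP and DOCS are placeholders).
cat > exp_no_lob.par <<'EOF'
SCHEMAS=APP
EXCLUDE=TABLE:"IN ('DOCS')"
DIRECTORY=DATA_PUMP_DIR
DUMPFILE=app_nolob_%U.dmp
LOGFILE=app_nolob.log
PARALLEL=4
EOF

# Job 2: only the LOB table, without PARALLEL since it would not help here.
cat > exp_lob_only.par <<'EOF'
TABLES=APP.DOCS
DIRECTORY=DATA_PUMP_DIR
DUMPFILE=app_docs.dmp
LOGFILE=app_docs.log
EOF

# The two jobs can then run side by side, for example:
#   expdp system parfile=exp_no_lob.par
#   expdp system parfile=exp_lob_only.par
```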
** pga_aggregate_target proved to be the most important change in the overall scheme of things,
because indexes were built towards the end of the job and took 3 times longer
than creating the tables and importing the data in this test.

Check LOB columns with the query below:

SELECT  s.tablespace_name ,l.owner,l.table_name,l.column_name,l.segment_name,s.segment_type, round(s.bytes/1024/1024/1024,2) "Size(GB)"
FROM DBA_SEGMENTS s,dba_lobs l
where l.owner = s.owner and l.segment_name = s.segment_name
and l.owner not in ('SYS','SYSTEM','APPS','APPLSYS')
--and round(s.bytes/1024/1024/1024,2)>1
order by s.bytes desc;

See the note below for how to address the issue:
Master Note: Overview of Oracle Large Objects (BasicFiles LOBs and SecureFiles LOBs) (Doc ID 1490228.1)

How to use/estimate PARALLEL parameter in Datapump?

Before starting an export, it is better to run expdp with the ESTIMATE_ONLY parameter first. Divide the estimated size by 250 MB and use the result to decide on a PARALLEL value.
When using the PARALLEL option, keep the points below in mind:
a. Set the degree of parallelism to 2 x the number of CPUs, then tune from there.
b. For Data Pump Export, the PARALLEL value should be less than or equal to the number of dump files.
c. For Data Pump Import, the PARALLEL value should not be much larger than the number of files in the dump file set.
For more details, refer to MOS doc 365459.1
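
One way to turn these rules of thumb into a starting value is sketched below. The function name suggest_parallel and the way the three rules are combined (take the smallest of the three limits) are assumptions made for illustration, not an Oracle formula.

```shell
# Suggest a starting PARALLEL value from the rules of thumb above.
# Arguments: estimated size in MB, number of CPUs, number of dump files.
suggest_parallel() {
  est_mb=$1
  cpus=$2
  files=$3
  p=$(( (est_mb + 249) / 250 ))   # one worker per 250 MB, rounded up
  cap=$(( 2 * cpus ))             # rule a: at most 2 x CPUs
  if [ "$p" -gt "$cap" ]; then p=$cap; fi
  if [ "$p" -gt "$files" ]; then p=$files; fi   # rules b/c: not above the file count
  if [ "$p" -lt 1 ]; then p=1; fi
  echo "$p"
}

# Example: a 2000 MB estimate on 4 CPUs with 10 dump files
#   suggest_parallel 2000 4 10    # prints 8
```

Treat the result as a first guess and tune from there, as point a. suggests.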

How to Check/Monitor DATAPUMP JOBS?

Monitoring Data Pump
Queries to monitor Data Pump jobs
How to delete/remove non-executing Data Pump jobs?
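
As a starting point for monitoring, a query against the DBA_DATAPUMP_JOBS view lists each job with its state and degree of parallelism. The sketch below just writes such a query to a script; the file name is a placeholder.

```shell
# Write a monitoring query for Data Pump jobs (dp_jobs.sql is a placeholder name).
cat > dp_jobs.sql <<'EOF'
SELECT owner_name, job_name, operation, job_mode, state, degree, attached_sessions
FROM   dba_datapump_jobs
ORDER  BY owner_name, job_name;
EOF

# Run it from SQL*Plus, for example:
#   sqlplus system @dp_jobs.sql
```

Jobs shown with state NOT RUNNING are candidates for cleanup. Confirm that a job will not be restarted before removing anything that belongs to it.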

Data Pump uses two different load methods during import (impdp)

  1. Direct path load: this is the main reason why Data Pump import (impdp) is faster than traditional import (imp)
  2. External table path
    Data Pump cannot always use direct path because of some restrictions, which is why impdp sometimes runs slower than expected.
    Direct path is not used in the following situations:
    1. A global index on multipartition tables exists during a single-partition load. This includes object tables that are partitioned.
    2. A domain index exists for a LOB column.
    3. A table is in a cluster.
    4. There is an active trigger on a pre-existing table.
    5. Fine-grained access control is enabled in insert mode on a pre-existing table.
    6. A table contains BFILE columns or columns of opaque types.
    7. A referential integrity constraint is present on a pre-existing table.
    8. A table contains VARRAY columns with an embedded opaque type.
    9. The table has encrypted columns
    10. The table into which data is being imported is a pre-existing table and at least one of the following conditions exists:
    – There is an active trigger
    – The table is partitioned
    – A referential integrity constraint exists
    – A unique index exists
    11. Supplemental logging is enabled and the table has at least 1 LOB column.
    Note: Data Pump will not load tables with disabled unique indexes. If the data needs to be loaded into the table, the indexes must be either dropped or re-enabled.

Master Note for Data Pump: MOS Note 1264715.1
For compatibility and version changes: MOS Note 553337.
Using Oracle’s recycle bin
Master Note: Overview of Oracle Large Objects (BasicFiles LOBs and SecureFiles LOBs) (Doc ID 1490228.1)
Checklist For Slow Performance Of DataPump Export (expdp) And Import (impdp) (Doc ID 453895.1)
How to use PARALLEL parameter in Datapump?