Transportation: Loading Warehouse Data

Overview

Objectives
After completing this lesson, you should be able to do the following:
- Explain key concepts in transporting data into the warehouse
- Outline how to build the transportation process for first-time load
- Identify transportation techniques
- Identify the tasks that take place after data is loaded
- Explain the issues involved in designing the transportation, loading, and scheduling processes

Transporting Data into the Warehouse
- Loading moves the data into the warehouse
- Loading can be time-consuming:
  - Consider the load window
  - Schedule the task; automate all processes
- Initial load moves large volumes
- Subsequent refresh moves smaller volumes
- Business determines the cycle
[Diagram: Extract, Transform, Transport (load), Warehouse]

Extract Processing Environment
- After each time interval, build a new database
- Run queries

Warehouse Processing Environment
- Build a new database
- After each time interval, add changes to database
- Archive or purge oldest data
- Run queries
[Diagram: T1, T2, T3, Operational databases]

First-Time Load
- Single event that populates the database with historical data
- Involves large volume of data
- Employs distinct ETT tasks
- Involves large amounts of processing after load
[Diagram: T1, T2, T3, Operational databases]

Refresh
- Performed according to a business cycle
- Simpler task
- Less data to load than first-time load
- Less-complex ETT
- Smaller amounts of postload processing
[Diagram: T1, T2, T3, Operational databases]

Building the Transportation Process
- Specification
- Techniques and tools
- File transfer methods
- The load window
- Time window for other tasks
- First-time and refresh volumes
- Frequency of the refresh cycle
- Connectivity bandwidth

Building the Transportation Process (continued)
- Test the proposed technique
- Document proposed load
- Gain agreement on the process
- Monitor
- Review
- Revise

Granularity
- Important design and operational issue
- Space requirements
- Storage
- Backup
- Recovery
- Partitioning
- Load
- Low-level grain: expensive, high level of processing, more disk, detail
- High-level grain: cheaper, less processing, less disk, little detail

Transportation Techniques
- Tools
- Utilities and 3GL
- Gateways
- Customized copy programs
- Replication
- FTP
- Manual

Transportation Technique Considerations
- Tools are comprehensive but costly.
- Data-movement utilities are fast and powerful.
- Gateways are not always the fastest method:
  - Access other databases
  - Supply dependent data marts
  - Support a distributed environment
  - Provide real-time access if needed

Using SQL*Loader to Load Data
[Diagram: SQL*Loader, Control file, Bad files, Log files, Discard files]
- Fastest load mechanism
- Direct path
- Parallel and unrecoverable
- Direct-load INSERT (Oracle8)
- Direct-path load API (Oracle8i)

Direct-Path Load API in Oracle8i
[Diagram: Load utility]
- Allows ETT and other tools to load Oracle databases efficiently
- Permits load behavior to be customized
- Gives direct-path load performance
- Provides complete access to all direct-load functionality using OCI

More Transportation Technique Considerations
- Use customized programs as a last resort.
- Replication is limited by data-transfer rates.

Postprocessing of Loaded Data
[Diagram: Extract, Transform, Transport, Loaded data, Postprocessing of loaded data]
- Create indexes
- Generate keys
- Summarize
- Filter

Indexing Data
- Before load: fast index reenablement
- During load: adds time to load window
- After load: adds time to load window
[Diagram: Index]

Unique Indexes
- Disable constraints to load
- Enable constraints to create index

Creating Artificial Keys
- Use generalized or derived keys
- Maintain the uniqueness of a row
- Use an administrative process to assign the key
- Concatenate operational key with number:
  - Easy to maintain
  - Cumbersome keys
  - No clean value for retrieval
[Example values: 109908, 10990801]

Creating Unique Keys for Records
- Assign a number from a list:
  - No semantic meaning
  - Extract operations must reference table to assign numbers
  - Update metadata
- Verdict
[Example values: 109908, 1]

Creating Summary Tables
- CTAS
- pCTAS
[Diagram: Summary data]

Filtering Data
- From warehouse to data marts
- CTAS
- pCTAS
[Diagram: Summary data, Warehouse, Data marts]

Verifying Data Integrity
- Load data into intermediate file
- Compare target flash totals with totals before load
[Diagram: Load, File 1, File 2, Flash totals (counts and amounts), Intermediate file, = / !=, Warehouse; "Preserve, inspect, fix, then load"]

Steps for Verifying Data Integrity
[Diagram: Control, Extract, SQL*Loader, .bad, .log; steps 1 to 7]

Standard Quality Assurance Checks
- Load status
- Completion of the process
- Completeness of the data
- Data reconciliation
- Violations
- Reprocessing
- Comparison of counts and amounts

Summary
This lesson discussed the following topics:
- First-time load considerations
- Techniques for transporting data
- Tasks involved in the postload processing stage

Practice 12-1 Overview
This practice covers the following topics:
- Identifying a series of statements as true or false
- Answering a series of questions
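
The flash-total comparison described under Verifying Data Integrity (counts and amounts before the load versus the intermediate file) could be sketched roughly as follows. This is a minimal illustration, not the course's own procedure: the function names `flash_totals` and `verify_load`, the `amount` field, and the in-memory row format are all assumptions made for the example.

```python
from decimal import Decimal


def flash_totals(rows, amount_field):
    """Return (row count, sum of amount_field) for an iterable of dict rows."""
    count = 0
    total = Decimal("0")
    for row in rows:
        count += 1
        # Decimal avoids float rounding when summing monetary amounts.
        total += Decimal(str(row[amount_field]))
    return count, total


def verify_load(source_rows, staged_rows, amount_field="amount"):
    """Compare flash totals taken before the load with those of the staged data.

    Returns (matches, totals_before, totals_after). Field name and row
    format are hypothetical; a real check would read the extract file and
    the intermediate file or table.
    """
    before = flash_totals(source_rows, amount_field)
    after = flash_totals(staged_rows, amount_field)
    return before == after, before, after


# Hypothetical extract rows and two staged copies: one complete, one short.
source = [{"id": 1, "amount": "10.50"}, {"id": 2, "amount": "4.25"}]
staged_ok = list(source)
staged_bad = source[:1]

ok, before, after = verify_load(source, staged_ok)
print("complete load matches:", ok)

ok, before, after = verify_load(source, staged_bad)
print("short load matches:", ok)
```

When the totals differ, the slide's advice applies: preserve the intermediate file, inspect and fix it, and only then load the warehouse.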
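
The two key-generation approaches from Creating Artificial Keys and Creating Unique Keys for Records can be contrasted in a short sketch. It assumes the slide's example values mean that operational key 109908 becomes 10990801 under concatenation and 1 under list assignment; the two-digit suffix width, the function `concatenated_key`, and the class `SurrogateKeyTable` are hypothetical names chosen for illustration.

```python
from itertools import count


def concatenated_key(operational_key, number):
    """Concatenate the operational key with a number (two digits assumed)."""
    return f"{operational_key}{number:02d}"


class SurrogateKeyTable:
    """Assign the next number from a list to each new operational key.

    The numbers carry no semantic meaning, so every extract must consult
    this lookup table to find or assign the warehouse key.
    """

    def __init__(self, start=1):
        self._next = count(start)
        self._assigned = {}

    def key_for(self, operational_key):
        # Reuse the existing number so a row keeps one unique key.
        if operational_key not in self._assigned:
            self._assigned[operational_key] = next(self._next)
        return self._assigned[operational_key]


print(concatenated_key("109908", 1))

table = SurrogateKeyTable()
print(table.key_for("109908"))
print(table.key_for("109909"))
print(table.key_for("109908"))
```

The concatenated form stays readable but grows cumbersome, while the assigned number is compact at the cost of maintaining the lookup table and its metadata.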