Databases for Bioinformatics
陈艳炯, Department of Immunology and Pathogen Biology, School of Medicine

Fundamentals of Database Systems
  Basic database concepts
  Development of data management systems
  Development of database technology
  Components of a database system
  Architecture of database application systems

Data
  Definition: symbolic records that describe objective things (objects).
  Types: text, graphics, images, sound.
  Characteristic: data cannot be separated from its semantics.

Data
The term data means groups of information that represent the qualitative or quantitative attributes of a variable or set of variables. Data (plural of datum, which is seldom used) are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables. Data are often viewed as the lowest level of abstraction from which information and knowledge are derived.

How the concept of data has changed
  In kind: from simple to integrated; from private to shared.
  In quantity: from small to large to massive volumes.
  In position: from a subordinate role in software to a dominant one.

Information
Information is an abstract reflection, carried by data, of the things, events, and concepts that actually exist in the objective world.
  Information = data + data processing

Data processing
Computer data processing is any process that uses a computer program to enter data and summarise, analyse, or otherwise convert data into usable information. The process may be automated and run on a computer. It involves recording, analysing, sorting, summarising, calculating, disseminating, and storing data. Because data are most useful when well presented and actually informative, data-processing systems are often referred to as information systems.

Data analysis
When the data come from a scientific or engineering domain, "data processing" and "information systems" are considered too broad, and the more specialized term "data analysis" is typically used. It focuses on the highly specialized and highly accurate algorithmic derivations and statistical calculations that are less often seen in a typical general business environment. Data analysis packages such as DAP, gretl, or PSPP are often used.

Elements of data processing
To be processed by a computer, data must first be converted into a machine-readable format. Once data is in digital format, various procedures can be applied to it to obtain useful information. Data processing may involve various processes, including:
  Data acquisition
  Data entry
  Data cleaning
  Data validation
  Data tabulation
  Statistical analysis
  Computer graphics
  Data warehousing
  Data mining

Data acquisition
In computer data processing, data acquisition is the sampling of real-world physical conditions and the conversion of the resulting samples into digital numeric values that can be manipulated by a computer. The components of data acquisition systems include:
  Sensors that convert physical parameters to electrical signals.
  Signal conditioning circuitry to coerce sensor signals into a form that can be converted to digital values.
  Analog-to-digital converters, which convert conditioned sensor signals to digital values.
Depending on the application, acquired data may be displayed, analyzed, or recorded, or some combination thereof. Data acquisition applications may be controlled by commercial DAQ software or by custom programs developed using various general-purpose programming languages such as BASIC or C. Specialized programming languages used for data acquisition include EPICS, for building large-scale data acquisition systems; LabVIEW, which offers a graphical programming environment; and MATLAB, which provides graphical tools and libraries for data acquisition and analysis.

Data cleansing
Data cleansing, or data scrubbing, is the act of detecting and correcting (or removing) corrupt or inaccurate records from a record set, table, or database. Used mainly in databases, the term refers to identifying incomplete, incorrect, inaccurate, or irrelevant parts of the data and then replacing, modifying, or deleting this dirty data. After cleansing, a data set will be consistent with other similar data sets in the system. The inconsistencies detected or removed may originally have been caused by different data dictionary definitions of similar entities in different stores, by user entry errors, or by corruption in transmission or storage. Data cleansing differs from data validation in that validation almost invariably means data is rejected from the system at entry and is performed at entry time, rather than on batches of data. The actual process of data cleansing may involve removing typographical errors or validating and correcting values against a known list of entities. The validation may be strict (such as rejecting any address that does not have a valid postal code) or fuzzy (such as correcting records that partially match existing, known records).

Data entry
A data entry clerk is a member of staff who reads handwritten or printed records and types them into a computer. Such clerks are sometimes employed on a temporary basis, but most large companies that handle large amounts of data hire them on a near-permanent basis.

Data validation
In computer science, data validation is the process of ensuring that a program operates on clean, correct, and useful data. It uses routines, often called validation rules or check routines, that check the correctness, meaningfulness, and security of data that are input to the system. The rules may be implemented through the automated facilities of a data dictionary, or by the inclusion of explicit validation logic in the application program. Incorrect data validation can lead to data corruption or a security vulnerability. Data validation checks that data are valid, sensible, reasonable, and secure before they are processed.

Computer graphics
Computer graphics are graphics created using computers and, more generally, the representation and manipulation of pictorial data by a computer. The development of computer graphics, often referred to simply as CG, has made computers easier to interact with, and better for understanding and int
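
The contrast drawn above between data validation (strict, applied at entry time) and data cleansing (often fuzzy, applied to batches against a known list of entities) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the slides: the KNOWN_SPECIES list, the record fields "species" and "length", the function names, and the 0.8 similarity cutoff. Fuzzy matching uses Python's standard difflib module, which is only one of many possible techniques.

```python
import difflib

# Hypothetical reference list of known entities (an assumption, not from the slides).
KNOWN_SPECIES = ["Homo sapiens", "Mus musculus", "Escherichia coli"]


def validate_entry(record):
    """Strict validation at entry time: reject the record if any rule fails."""
    if record.get("species") not in KNOWN_SPECIES:
        raise ValueError(f"unknown species: {record.get('species')!r}")
    length = record.get("length")
    if not isinstance(length, int) or length <= 0:
        raise ValueError("length must be a positive integer")
    return record


def cleanse_batch(records, cutoff=0.8):
    """Fuzzy cleansing of a stored batch: correct near matches, drop the rest."""
    cleaned = []
    for record in records:
        # Find the closest known entity, if any is similar enough.
        matches = difflib.get_close_matches(
            record.get("species") or "", KNOWN_SPECIES, n=1, cutoff=cutoff
        )
        if matches:
            cleaned.append({**record, "species": matches[0]})
    return cleaned
```

Under these assumptions, validate_entry would refuse a record whose species is misspelled, while cleanse_batch would try to repair the same record by mapping it to the closest known name and would discard it only if nothing in the list is similar enough.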