Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
| Author |
Topic |
|
crudmop
Starting Member
14 Posts |
Posted - 2006-03-10 : 15:19:06
|
| I am currently running a convoluted process to import large xml files - the process, though it works, is slowly showing me the idosyncracies with openXML and I want to switch over to bulkload to manage the imports. The files can go anywhere from small (1meg, 20meg, 40 meg) up to 400+ meg. To handle the large sizes, the current process sweeps a directory, finds the xml files, breaks them apart to 10000 records or less, and recreates smaller xml files which are then imported via openxml. The xml docs are imported into a single table.The xml doc is as follows:-----------------------------------<?xml version="1.0" encoding="ISO-8859-1"?> <!DOCTYPE broadWorksCDR> <broadWorksCDR version="11.1"> <cdrData> <headerModule> <recordId> <eventCounter> </eventCounter> <systemId> </systemId> <date> </date> <systemTimeZone> </systemTimeZone> </recordId> <serviceProvider> </serviceProvider> <type> </type> </headerModule> <basicModule> <userNumber> </userNumber> <groupNumber> </groupNumber> <direction> </direction> <callingNumber> </callingNumber> <callingPresentationIndicator> </callingPresentationIndicator> <dialedDigits> </dialedDigits> <calledNumber> </calledNumber> <networkTranslatedNumber> </networkTranslatedNumber> <networkTranslatedGroup> </networkTranslatedGroup> <startTime> </startTime> <userTimeZone> </userTimeZone> <localCallId> </localCallId> <remoteCallId> </remoteCallId> <answerIndicator> </answerIndicator> <answerTime> </answerTime> <releaseTime> </releaseTime> <terminationCause> </terminationCause> <carrierIdentificationCode> </carrierIdentificationCode> <callCategory> </callCategory> <networkCallType> </networkCallType> <chargeIndicator> </chargeIndicator> <typeOfNetwork> </typeOfNetwork> <releasingParty> </releasingParty> </basicModule> <centrexModule> <group> </group> <department> </department> <accountCode> </accountCode> <authorizationCode> </authorizationCode> <callingPartyCategory> </callingPartyCategory> <originalCalledNumber> </originalCalledNumber> <originalCalledPresentationIndicator> </originalCalledPresentationIndicator> <originalCalledReason> </originalCalledReason> <redirectingNumber> </redirectingNumber> <redirectingPresentationIndicator> </redirectingPresentationIndicator> <redirectingReason> </redirectingReason> <serviceExtensionList> <serviceExtension> <serviceName> </serviceName> <invocationTime> </invocationTime> <callId> </callId> <to> </to> <from> </from> <conferenceId> </conferenceId> <role> </role> <bridge> </bridge> <owner> </owner> <ownerDN> </ownerDN> <title> </title> <projectCode> </projectCode> <facResult> </facResult> </serviceExtension> </serviceExtensionList> </centrexModule> <ipModule> <route> </route> <networkCallID> </networkCallID> <codec> </codec> <AccessDeviceAddress> </AccessDeviceAddress> <correlationInfo> <key> </key> <creator> </creator> <originatorNetwork> </originatorNetwork> <terminatorNetwork> </terminatorNetwork> </correlationInfo> </ipModule> <partialCallBeginModule> <failoverCorrelationId> </failoverCorrelationId> </partialCallBeginModule> <partialCallEndModule> <failoverCorrelationId> </failoverCorrelationId> </partialCallEndModule> </cdrData> </broadWorksCDR>----------------------------With my schema (at this point):--------------------------------<?xml version="1.0" ?><Schema xmlns="urn:schemas-microsoft-com:xml-data" xmlns:dt="urn:schemas-microsoft-com:xml:datatypes" xmlns:sql="urn:schemas-microsoft-com:xml-sql" > <ElementType name="eventCounter" dt:type="string" /> <ElementType name="systemId" dt:type="string" /> <ElementType name="date" dt:type="string" /> <ElementType name="systemTimeZone" dt:type="string" /> <ElementType name="userNumber" dt:type="string" /> <ElementType name="groupNumber" dt:type="string" /> <ElementType name="direction" dt:type="string" /> <ElementType name="callingNumber" dt:type="string" /> <ElementType name="callingPresentationIndicator" dt:type="string" /> <ElementType name="dialedDigits" dt:type="string" /> <ElementType name="calledNumber" dt:type="string" /> <ElementType name="networkTranslatedNumber" dt:type="string" /> <ElementType name="networkTranslatedGroup" dt:type="string" /> <ElementType name="startTime" dt:type="string" /> <ElementType name="userTimeZone" dt:type="string" /> <ElementType name="localCallId" dt:type="string" /> <ElementType name="remoteCallId" dt:type="string" /> <ElementType name="answerIndicator" dt:type="string" /> <ElementType name="answerTime" dt:type="string" /> <ElementType name="releaseTime" dt:type="string" /> <ElementType name="terminationCause" dt:type="string" /> <ElementType name="carrierIdentificationCode" dt:type="string" /> <ElementType name="callCategory" dt:type="string" /> <ElementType name="networkCallType" dt:type="string" /> <ElementType name="chargeIndicator" dt:type="string" /> <ElementType name="typeOfNetwork" dt:type="string" /> <ElementType name="releasingParty" dt:type="string" /> <ElementType name="broadWorksCDR" sql:is-constant="1"> <element type="cdrData" /> </ElementType> <ElementType name="cdrData" sql:is-constant="1"> <element type="headerModule" /> <element type="basicModule" /> </ElementType> <ElementType name="headerModule" sql:is-constant="1"> <element type="recordId" /> </ElementType> <ElementType name="recordId" sql:relation="cdrdata"> <element type="eventCounter" sql:field="eventcounter" /> <element type="systemId" sql:field="systemId" /> <element type="date" sql:field="date" /> <element type="systemTimeZone" sql:field="systemtimezone" /> </ElementType> <ElementType name="serviceProvider" sql:relation="cdrdata"> <element type="serviceProvider" sql:field="serviceProvider" /> </ElementType> <ElementType name="type" sql:relation="cdrdata"> <element type="type" sql:field="type" /> </ElementType> <ElementType name="basicModule" sql:relation="cdrdata"> <element type="userNumber" sql:field="userNumber" /> <element type="groupNumber" sql:field="groupNumber" /> <element type="direction" sql:field="direction" /> <element type="callingNumber" sql:field="callingNumber" /> <element type="callingPresentationIndicator" sql:field="callingPresentationIndicator" /> <element type="dialedDigits" sql:field="dialedDigits" /> <element type="calledNumber" sql:field="calledNumber" /> <element type="networkTranslatedNumber" sql:field="networkTranslatedNumber" /> <element type="networkTranslatedGroup" sql:field="networkTranslatedGroup" /> <element type="startTime" sql:field="startTime" /> <element type="userTimeZone" sql:field="userTimeZone" /> <element type="localCallId" sql:field="localCallId" /> <element type="remoteCallId" sql:field="remoteCallId" /> <element type="answerIndicator" sql:field="answerIndicator" /> <element type="answerTime" sql:field="answerTime" /> <element type="releaseTime" sql:field="releaseTime" /> <element type="terminationCause" sql:field="terminationCause" /> <element type="carrierIdentificationCode" sql:field="carrierIdentificationCode" /> <element type="callCategory" sql:field="callCategory" /> <element type="networkCallType" sql:field="networkCallType" /> <element type="chargeIndicator" sql:field="chargeIndicator" /> <element type="typeOfNetwork" sql:field="typeOfNetwork" /> <element type="releasingParty" sql:field="releasingParty" /> </ElementType> </Schema>------------------------The issue: in it's current tested but incomplete form, the data imports - but it loads the headermodule data in 1000 rows, then loads the basicmodule data in the next 1000 rows - although all should be loaded together. I know this is caused by the sublevels of children for the nodes that have them, but I cannot seem to find out how I should have the schema designed to get this working right.If anyone has any suggestions, I could sure use them - I need to get this implemented asap. I'll even wash your car and stuff.Thanks in advance for any help you might have. |
|
|
|
|
|
|
|