Please start any new threads on our new site at https://forums.sqlteam.com. We've got lots of great SQL Server experts to answer whatever question you can come up with.

 All Forums
 SQL Server 2000 Forums
 SQL Server Development (2000)
 XML Bulk importing - need help with schema file

Author  Topic 

crudmop
Starting Member

14 Posts

Posted - 2006-03-10 : 15:19:06
I am currently running a convoluted process to import large xml files - the process, though it works, is slowly showing me the idosyncracies with openXML and I want to switch over to bulkload to manage the imports.

The files can go anywhere from small (1meg, 20meg, 40 meg) up to 400+ meg. To handle the large sizes, the current process sweeps a directory, finds the xml files, breaks them apart to 10000 records or less, and recreates smaller xml files which are then imported via openxml.

The xml docs are imported into a single table.


The xml doc is as follows:

-----------------------------------
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE broadWorksCDR>
<broadWorksCDR version="11.1">
<cdrData>
<headerModule>
<recordId>
<eventCounter> </eventCounter>
<systemId> </systemId>
<date> </date>
<systemTimeZone> </systemTimeZone>
</recordId>
<serviceProvider> </serviceProvider>
<type> </type>
</headerModule>
<basicModule>
<userNumber> </userNumber>
<groupNumber> </groupNumber>
<direction> </direction>
<callingNumber> </callingNumber>
<callingPresentationIndicator> </callingPresentationIndicator>
<dialedDigits> </dialedDigits>
<calledNumber> </calledNumber>
<networkTranslatedNumber> </networkTranslatedNumber>
<networkTranslatedGroup> </networkTranslatedGroup>
<startTime> </startTime>
<userTimeZone> </userTimeZone>
<localCallId> </localCallId>
<remoteCallId> </remoteCallId>
<answerIndicator> </answerIndicator>
<answerTime> </answerTime>
<releaseTime> </releaseTime>
<terminationCause> </terminationCause>
<carrierIdentificationCode> </carrierIdentificationCode>
<callCategory> </callCategory>
<networkCallType> </networkCallType>
<chargeIndicator> </chargeIndicator>
<typeOfNetwork> </typeOfNetwork>
<releasingParty> </releasingParty>
</basicModule>
<centrexModule>
<group> </group>
<department> </department>
<accountCode> </accountCode>
<authorizationCode> </authorizationCode>
<callingPartyCategory> </callingPartyCategory>
<originalCalledNumber> </originalCalledNumber>
<originalCalledPresentationIndicator> </originalCalledPresentationIndicator> <originalCalledReason> </originalCalledReason>
<redirectingNumber> </redirectingNumber>
<redirectingPresentationIndicator> </redirectingPresentationIndicator>
<redirectingReason> </redirectingReason>
<serviceExtensionList> <serviceExtension>
<serviceName> </serviceName>
<invocationTime> </invocationTime>
<callId> </callId>
<to> </to>
<from> </from>
<conferenceId> </conferenceId>
<role> </role>
<bridge> </bridge>
<owner> </owner>
<ownerDN> </ownerDN>
<title> </title>
<projectCode> </projectCode>
<facResult> </facResult>
</serviceExtension> </serviceExtensionList>
</centrexModule>
<ipModule>
<route> </route>
<networkCallID> </networkCallID>
<codec> </codec>
<AccessDeviceAddress> </AccessDeviceAddress>
<correlationInfo>
<key> </key>
<creator> </creator>
<originatorNetwork> </originatorNetwork>
<terminatorNetwork> </terminatorNetwork>
</correlationInfo>
</ipModule>
<partialCallBeginModule>
<failoverCorrelationId> </failoverCorrelationId>
</partialCallBeginModule>
<partialCallEndModule>
<failoverCorrelationId> </failoverCorrelationId>
</partialCallEndModule>
</cdrData>
</broadWorksCDR>
----------------------------


With my schema (at this point):


--------------------------------
<?xml version="1.0" ?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
xmlns:dt="urn:schemas-microsoft-com:xml:datatypes"
xmlns:sql="urn:schemas-microsoft-com:xml-sql" >

<ElementType name="eventCounter" dt:type="string" />
<ElementType name="systemId" dt:type="string" />
<ElementType name="date" dt:type="string" />
<ElementType name="systemTimeZone" dt:type="string" />
<ElementType name="userNumber" dt:type="string" />
<ElementType name="groupNumber" dt:type="string" />
<ElementType name="direction" dt:type="string" />
<ElementType name="callingNumber" dt:type="string" />
<ElementType name="callingPresentationIndicator" dt:type="string" />
<ElementType name="dialedDigits" dt:type="string" />
<ElementType name="calledNumber" dt:type="string" />
<ElementType name="networkTranslatedNumber" dt:type="string" />
<ElementType name="networkTranslatedGroup" dt:type="string" />
<ElementType name="startTime" dt:type="string" />
<ElementType name="userTimeZone" dt:type="string" />
<ElementType name="localCallId" dt:type="string" />
<ElementType name="remoteCallId" dt:type="string" />
<ElementType name="answerIndicator" dt:type="string" />
<ElementType name="answerTime" dt:type="string" />
<ElementType name="releaseTime" dt:type="string" />
<ElementType name="terminationCause" dt:type="string" />
<ElementType name="carrierIdentificationCode" dt:type="string" />
<ElementType name="callCategory" dt:type="string" />
<ElementType name="networkCallType" dt:type="string" />
<ElementType name="chargeIndicator" dt:type="string" />
<ElementType name="typeOfNetwork" dt:type="string" />
<ElementType name="releasingParty" dt:type="string" />


<ElementType name="broadWorksCDR" sql:is-constant="1">
<element type="cdrData" />
</ElementType>

<ElementType name="cdrData" sql:is-constant="1">
<element type="headerModule" />
<element type="basicModule" />
</ElementType>

<ElementType name="headerModule" sql:is-constant="1">
<element type="recordId" />
</ElementType>

<ElementType name="recordId" sql:relation="cdrdata">
<element type="eventCounter" sql:field="eventcounter" />
<element type="systemId" sql:field="systemId" />
<element type="date" sql:field="date" />
<element type="systemTimeZone" sql:field="systemtimezone" />
</ElementType>

<ElementType name="serviceProvider" sql:relation="cdrdata">
<element type="serviceProvider" sql:field="serviceProvider" />
</ElementType>

<ElementType name="type" sql:relation="cdrdata">
<element type="type" sql:field="type" />
</ElementType>

<ElementType name="basicModule" sql:relation="cdrdata">
<element type="userNumber" sql:field="userNumber" />
<element type="groupNumber" sql:field="groupNumber" />
<element type="direction" sql:field="direction" />
<element type="callingNumber" sql:field="callingNumber" />
<element type="callingPresentationIndicator" sql:field="callingPresentationIndicator" />
<element type="dialedDigits" sql:field="dialedDigits" />
<element type="calledNumber" sql:field="calledNumber" />
<element type="networkTranslatedNumber" sql:field="networkTranslatedNumber" />
<element type="networkTranslatedGroup" sql:field="networkTranslatedGroup" />
<element type="startTime" sql:field="startTime" />
<element type="userTimeZone" sql:field="userTimeZone" />
<element type="localCallId" sql:field="localCallId" />
<element type="remoteCallId" sql:field="remoteCallId" />
<element type="answerIndicator" sql:field="answerIndicator" />
<element type="answerTime" sql:field="answerTime" />
<element type="releaseTime" sql:field="releaseTime" />
<element type="terminationCause" sql:field="terminationCause" />
<element type="carrierIdentificationCode" sql:field="carrierIdentificationCode" />
<element type="callCategory" sql:field="callCategory" />
<element type="networkCallType" sql:field="networkCallType" />
<element type="chargeIndicator" sql:field="chargeIndicator" />
<element type="typeOfNetwork" sql:field="typeOfNetwork" />
<element type="releasingParty" sql:field="releasingParty" />
</ElementType>

</Schema>
------------------------



The issue: in it's current tested but incomplete form, the data imports - but it loads the headermodule data in 1000 rows, then loads the basicmodule data in the next 1000 rows - although all should be loaded together. I know this is caused by the sublevels of children for the nodes that have them, but I cannot seem to find out how I should have the schema designed to get this working right.

If anyone has any suggestions, I could sure use them - I need to get this implemented asap. I'll even wash your car and stuff.

Thanks in advance for any help you might have.

   

- Advertisement -