Please start any new threads on our new
site at https://forums.sqlteam.com. We've got lots of great SQL Server
experts to answer whatever question you can come up with.
Author |
Topic |
aakcse
Aged Yak Warrior
570 Posts |
Posted - 2010-03-22 : 15:53:13
|
Hi all,I have loaded a file content in to a single table with single column,the file is having data as belowRecord 1. ABC|AVC1|1234| xyz|abcRecord 2. ABC1|AVC1|12334| xyz |ABCRecord 3. ABC1|AVC1|12334|xyz|ABC I need to correct the above data in the table, the records 3. is the correct format which I want record1 and record2 to be like.Note Record1..3. text I have added above. ( this is just an e.g. actual table has 5million rec) Can any one of you help me in correcting this file using plsql or sql query by creating another table from this one, which has corrected dataLogic I am think is Every records must have 4 delimiter if less then bring the next line record to the previous line e.g.Record 1. ABC|AVC1|1234|xyz|abcEven if there are no delimiter on the line then also bring the next line record to the previous line e.g.Record 2. ABC1|AVC1|12334| xyz|ABCLike wise cleaning the entire table and placing it in a new table, to export. |
|
aakcse
Aged Yak Warrior
570 Posts |
Posted - 2010-03-22 : 15:55:00
|
What we can do is, count the no of delimiter in the row ( it is 11 actually) if it is 4 say then the row has correct data in it, only possiblity would be as below 11111|aaaaa|4455|45AAABBBCorrected one would be like this, with or with out space is ok for me11111|aaaaa|4455|45AAA BBBFor relation ship we can add a row number another column as a primary key auto increatemet, assume that this is added for all the rows ( as below)1 11111|aaaaa|4455|45AAA2 BBB |
|
|
aakcse
Aged Yak Warrior
570 Posts |
Posted - 2010-03-22 : 15:56:40
|
This works in Oracle as below, If some one can help me in writing a procedure or converting the below query in sql would be of help. 1 with t as ( select 1 as rn, 'ABC|AVC1|1234|' as txt from dual union all 2 select 2, 'xyz|abc' from dual union all 3 select 3, 'ABC1|AVC1|12334|' from dual union all 4 select 4, 'xyz' from dual union all 5 select 5, '|ABC' from dual union all 6 select 6, 'ABC1|AVC1|12334|xyz|ABC' from dual) 7 -- END OF TEST DATA 8 select rn, txt 9 from ( 10 select rn, regexp_substr(txt, '^[^|]+\|[^|]+\|[^|]+\|[^|]+\|[^|]+') as txt 11 ,case when s < nvl(lag(s) over (order by rn),999) then 1 else 0 end as s 12 from ( 13 select rn, replace(txt||'|'||lead(txt) over (order by rn) 14 ||'|'||lead(txt,2) over (order by rn) 15 ||'|'||lead(txt,3) over (order by rn) 16 ||'|'||lead(txt,4) over (order by rn),'||','|') as txt 17 ,mod(sum(length(regexp_replace(txt,'[^|]'))) over (order by rn),5) as s 18 from t 19 ) 20 ) 21 where s = 1 22* order by rn RN TXT---------- ------------------------------ 1 ABC|AVC1|1234|xyz|abc 3 ABC1|AVC1|12334|xyz|ABC 6 ABC1|AVC1|12334|xyz|ABC replacing "t" in the main query with the name of your own table, and removing all the above lines from 1 till 7 will help in oracle but it need it to be done in sql |
|
|
aakcse
Aged Yak Warrior
570 Posts |
Posted - 2010-03-22 : 16:09:35
|
I have done till here, can any one help with proceeding further..(1)select * into upload_bkup from uploadselect * from upload(2)alter table upload add dtt_id1 int identity(1,1) (3)select len(column0) - len(replace(column0,'|','')) from upload -- with 11 delimiter is the correct one(4)delete upload where isnull(column0,'')=''--drop the previous identity and created new one retaining data from the previous identity column dt_id1alter table upload add dtt_id2 int identity(1,1)(5) The below query gives you all the records which went to next line with out delimiterselect a.dtt_id2 dtt_id2,b.dtt_id1 dtt_id1,a.column0+' '+b.column0 column0 into #upload from upload a,(select b.dtt_id1,b.dtt_id2,b.column0 from upload b where b.column0 not like '%|%')bwhere b.dtt_id2 = a.dtt_id2 + 1select * from #upload(6) consolidating all the records in to one tableselect a.* into upload_now_clean_delimiter from (select a.dtt_id2, a.column0 from upload a where a.dtt_id2 not in( select dtt_id2 from #upload) and a.column0 like '%|%' union select dtt_id2,column0 from #upload)aselect * from upload_now_clean_delimiter(7) now clean the records which are coming with delimiter to next line |
|
|
aakcse
Aged Yak Warrior
570 Posts |
Posted - 2010-03-24 : 07:04:37
|
Here is the code which works, for the above let me know if any one finds any issue for this codeDECLARE @X as table(rn int identity(1,1),txt varchar(max))INSERT INTO @X(txt)select 'ABC|AVC1|1234|' union allselect 'xyz|abc' union allselect 'ABC1|AVC1|12334|' union allselect 'x' union allselect 'y' union allselect 'z' union allselect '|ABC' union allselect 'ABC1|AVC1|12334|xyz|ABC' union all select 'XYZ'DECLARE @max bigint,@c bigintSELECT @max = MAX(rn),@c = 1 from @XDECLARE @targ varchar(max) WHILE @max <> 0 BEGIN SELECT @targ = txt FROM @X WHERE rn = @max IF (@targ not like '%|%|%|%|%') BEGIN UPDATE @X SET txt = txt+@targ WHERE rn = @max - 1 DELETE FROM @X WHERE rn = @max END SET @max = @max - 1ENDSELECT * FROM @X |
|
|
|
|
|
|
|