How to remove reverse duplicates in datastage ?

Scenario:
Objective: The interface aims to consolidate the round-trip ticket cost of passengers.
Description: The source file is a .txt file, train_route_src.txt, consolidated with trains from a particular source to destination. There are many repetitions in the source file (Reverse Duplication: eg. BLR --> BBS and BBS --> BLR). Remove the reverse duplicates and maintain the target in the following manner.
SRC-DES-SRC Distance Fare
SRC1-DES1-SRC1 2*Distance 2*fare
Source File:
SOURCE DESTINATION DISTANCE FARE
SBC BBS 1500 1500
GWH PUN 1750 1800
PUR CTC 90 90
FAR GHZ 750 750
BBS SBC 1500 1500
PUN GWH 1750 1800
CTC PUR 90 90
SIM OOT 500 500
GHZ FAR 750 750
OOT SIM 500 500
BRD REI 900 900
SRI CHN 4500 4500
REI BRD 900 900
GHN MNG 1650 1650
HWH BBS 550 550
MNG GHN 1650 1650
BBS HWH 550 550
CHN SRI 4500 4500
BBS GHN 1800 1800
BBS OOT 1500 1500
BBS DEL 2200 2200
OOT BBS 1500 1500
DEL BBS 2200 2200

Interview Candidate
Apr 9th, 2015
18
21603

DataStage

Answer

Showing Answers 1 - 18 of 18 Answers

D Hemakumar

Apr 24th, 2015

source---->columngenarator--------->transformer---->target
generate the two columns col1 and col2 and generate the sequence numbers for both columns in transformer write constraint col1>col2
take the source destination and distancefare columns in the output link

i hope this may work

Pavan Kumar

May 7th, 2015

Source >> Pivot enterprise {(horizontal pivot) combine both source and destination} >> remove duplicate stage >> Pivot enterprise stage {(vertical pivote) group by fare and set array size to 2, this will split the data into two columns (source and destination column)} >> Dataset/ Fileset.

sreeja

Jul 15th, 2015

source -> Sort -> filter > output
We can have sort stage and set the property create key change column give it a name "KeyChange".Map this column to output link.
Put a filter stage after sort satge and write a where condition KeyChange =1 and set output Reject =False
Hope this will work

bimaljsr@gmail.com

Aug 10th, 2015

See if this works

Code
 


SELECT a.Source1,a.destination,b.destination, 2*a.distance,2*a.fare 


FROM train A, train B


WHERE (A.Source1=B.destination


AND A.destination=B.Source1)

sreenivas

Jun 21st, 2016

source
**************
city1,city2,distance
hyd,pune,500
del,mum,1000
hyd,pune,500
blore,viza,700
del,mum,1000
mum,pune,300
target
*******************
city1,city2,distance
hyd,pune,500
del,mum,1000
blore,viza,700
mum,pune300
ANS) seq---->Pivot---->RD----->Pivot----->DS
first take PIVOT Horizontal -->pivot properties -->city1,city2(derivation)
In RD we can give combination of key =city & key=distance
In Pivot Vertical select Group by Distance & Pivot city.
we will get out put
i tried this its me working fine..

Naga Prasad

Jul 7th, 2016

I Think Sreenivass soultion will work.

Kumar Sonal

Jul 8th, 2016

1.Take another column called KEY in transformer. Derivation for key will be like If Source
2. Use sort stage to remove duplicate values on the basis of key column derived in transformer

Alam

Feb 15th, 2018

Ans--> Seq_File------>Sort------>Filter------->Seq_file
Sort Stage--->Generate key change column(it return for first value 1 and for duplicate 0) then used filter stahe
Filter Stage----> Filter by Key change column=1.
I think you will get the result.

Bhargav

Mar 2nd, 2018

I don't think you got the output for city in the format of SRC1-DES1-SRC1 (I mean two city names)? I tired your way but didn't get the expected output.

How to remove reverse duplicates in datastage ?

D Hemakumar

Pavan Kumar

sreeja

bimaljsr@gmail.com

sreenivas

Naga Prasad

Kumar Sonal

Alam

Bhargav

Give your answer:

Related Answered Questions

Related Open Questions

Latest News

It looks like you are using an AD Blocker!

Login

How to remove reverse duplicates in datastage ?

D Hemakumar

Pavan Kumar

sreeja

bimaljsr@gmail.com

sreenivas

Naga Prasad

Kumar Sonal

Alam

Bhargav

Give your answer:

Related Answered Questions

Related Open Questions

Latest News

It looks like you are using an AD Blocker!