Submitted Questions

  • Sequential file with Duplicate Records

    A sequential file has 8 records with one column. The values in the column, separated by spaces, are: 1 1 2 2 3 4 5 6
    In a parallel job, after reading the sequential file, two more sequential files should be created: one with the duplicated records and the other with the records that have no duplicates.
    File 1 records, separated by spaces: 1 1 2 2
    File 2 records, separated by spaces: 3 4 5 6
    How will you do it?

    Ram

    • Jul 1st, 2016

    Hi Pooja, it's absolutely possible. Src --> Copy (link sort) --> Aggr (count rows); another link from Copy --> Join (Copy & Aggr) --> Filter (count=1 for trg1 and co...

    Pooja Trivedi

    • Jun 30th, 2016

    This will not give the desired output, as we also want each duplicate record written n times, where n is the number of times that record is present in the file.
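
    The count-and-join idea discussed in this thread can be illustrated outside DataStage. The following is only a minimal Python sketch of the logic, not a DataStage job: the Counter stands in for the Aggregator row count, the per-record lookup of that count stands in for the Join, and the two list comprehensions stand in for the Filter outputs. The function name split_by_duplication is hypothetical, and routing count > 1 to the duplicates file is an assumption, since the reply above is truncated.

```python
from collections import Counter


def split_by_duplication(records):
    """Split records into (duplicated, unique) lists, keeping every occurrence.

    Hypothetical helper for illustration only.
    """
    # Aggregator step: count how many times each value occurs.
    counts = Counter(records)

    # Join + Filter step: look up each original row's count and route it.
    # Assumption: count > 1 goes to the duplicates file, count == 1 to the other.
    duplicated = [r for r in records if counts[r] > 1]
    unique = [r for r in records if counts[r] == 1]
    return duplicated, unique


records = [1, 1, 2, 2, 3, 4, 5, 6]
file1, file2 = split_by_duplication(records)
print(file1)  # [1, 1, 2, 2]
print(file2)  # [3, 4, 5, 6]
```

    Because the count is joined back to the original rows rather than replacing them, each duplicated value appears as many times as it did in the source, which is the requirement raised in the post above.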

  • Lookup Stage Partitioning

    With a Lookup stage, the input stage has 1 million records and the reference link has 1 million records. What is the best partitioning method, and why?