Graph Performance

I have a master file with personal details for around 50k users.
Each day a transaction file with about 100 million records comes in, and we need to update the transaction file with the user details present in the master file.
What is a good approach for this?

Questions by teegeo


sriram

  • Dec 11th, 2013
 

Join the old master file with the new master file. The matched records are the ones that do not need to be updated. The unused (unmatched) records are new records, which are loaded into the table.


1) Partition the data first.
2) Remove unwanted fields, if applicable.
3) Sort on the matching key. If the first dataset (the master) fits in memory without spilling to disk, which it should because it is small, keep it in memory and use the other dataset as the driving dataset. Otherwise, partition this one too.
4) Join on account number or customer ID, updating the existing fields on in1 with the values from in0. We can also add additional fields from the master file.
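
To make the idea in steps 3 and 4 concrete, here is a rough sketch in plain Python rather than in any particular ETL tool. It assumes both files are CSV, that the join key is a column called cust_id, and that the fields to copy are name and city; all of those names are made up for illustration. The small master file is loaded into a dictionary once, and the large transaction file is streamed one row at a time, so memory use depends only on the size of the master.

```python
import csv
import io


def load_master(master_file, key="cust_id"):
    """Read the small master file into a dict keyed by the join key.

    `cust_id` is a hypothetical column name, not taken from the question.
    """
    reader = csv.DictReader(master_file)
    return {row[key]: row for row in reader}


def enrich_transactions(txn_file, master, out_file, extra_fields, key="cust_id"):
    """Stream the large transaction file and add master fields to each row.

    Only one transaction row is held in memory at a time; the master dict
    is the only structure that has to fit in memory.
    Returns (matched, unmatched) row counts.
    """
    reader = csv.DictReader(txn_file)
    # Output columns: the original transaction columns plus any new ones.
    new_fields = [f for f in extra_fields if f not in reader.fieldnames]
    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames + new_fields)
    writer.writeheader()

    matched = unmatched = 0
    for txn in reader:
        user = master.get(txn[key])
        if user is None:
            # No master record: write the row as is; DictWriter leaves
            # the missing extra fields empty.
            unmatched += 1
        else:
            matched += 1
            for field in extra_fields:
                txn[field] = user[field]
        writer.writerow(txn)
    return matched, unmatched
```

To use it, open the master and transaction files, call load_master once, then call enrich_transactions with an output file and the list of fields to copy. If the master ever grows too large to hold in memory, this approach no longer applies and both inputs would need to be partitioned and sorted on the key, as described in step 3.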
