What is the latest version of Ab Initio that is available?
Output for sort and dedup sort with null key
I have a file containing 5 unique rows. I pass them through a Sort component with a null key and then pass the Sort output to Dedup Sorted. What will happen, and what will the output be?
Case 1: If Dedup Sorted also uses a null key, the output depends on the keep parameter. keep: first: 1st record; last: last record; unique: 0 records. Case 2: If we can take any ke...
Actually, if you use the Sort component with a {} key, it passes the records through as they are, and Dedup Sorted will likewise pass them through unchanged if there are no duplicates.
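The keep behaviour described in Case 1 can be modelled in a few lines of Python. This is only a sketch, not Ab Initio code: `dedup_sorted`, `empty_key` and `whole_record_key` are hypothetical names, and it assumes an empty key places every record in a single group, as Case 1 does.

```python
from itertools import groupby


def dedup_sorted(records, key, keep="first"):
    """Model a dedup over records that are already grouped on `key`.

    key  -- function returning the grouping key for a record
    keep -- "first", "last", or "unique-only"
    """
    out = []
    for _, group in groupby(records, key=key):
        group = list(group)
        if keep == "first":
            out.append(group[0])
        elif keep == "last":
            out.append(group[-1])
        elif keep == "unique-only":
            # Only groups holding exactly one record survive.
            if len(group) == 1:
                out.append(group[0])
        else:
            raise ValueError(f"unknown keep value: {keep}")
    return out


rows = ["r1", "r2", "r3", "r4", "r5"]

# Assumption: an empty key gives every record the same key value.
empty_key = lambda rec: ()
# A key on the whole record: five unique rows form five groups.
whole_record_key = lambda rec: rec

first = dedup_sorted(rows, empty_key, "first")
last = dedup_sorted(rows, empty_key, "last")
unique = dedup_sorted(rows, empty_key, "unique-only")
```

With a key that actually distinguishes the five rows (`whole_record_key`), every group has one record and all five rows pass through.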
Why go for sort within groups?
We have the Sort and Sort within Groups components. We can achieve the Sort within Groups functionality by specifying two keys in the Sort component. Then why do we need Sort within Groups?
Usually, when we use Sort within Groups, we can make sure of a pure sort on the defined keys... but if we use only Sort we can't decide data dynamically and...
Sort within Groups sorts the data on the major and minor keys in a single pass.
The Sort component sorts the data on one key at a time. If we then have to sort on some other key, we have to use the Sort component again, which takes extra time. In that case Sort within Groups is the better way.
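To make the major/minor key idea concrete, here is a small Python model. It is a sketch under one stated assumption: the input already arrives ordered on the major key, so only the records inside each group need sorting. The function and field names are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter


def sort_within_groups(records, major, minor):
    """Sort each run of equal `major` values on `minor`.

    Assumes `records` is already ordered on `major`; only one group
    at a time is held and sorted, not the whole input.
    """
    out = []
    for _, group in groupby(records, key=itemgetter(major)):
        out.extend(sorted(group, key=itemgetter(minor)))
    return out


records = [
    {"dept": "A", "emp": 3},
    {"dept": "A", "emp": 1},
    {"dept": "B", "emp": 2},
    {"dept": "B", "emp": 1},
]
result = sort_within_groups(records, "dept", "emp")
```

The result is the same ordering a full sort on both keys would give, but each sort only touches one group.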
How can we test Ab Initio manually and with automation?
Hi All,
I appreciate I am a late entry for this question. I think tools like Jenkins can be used as an automated test harness for Ab Initio. Please correct me if I am wrong.
Yes, that's correct. There is no automation testing available for Ab Initio.
Manually, you need to test the Ab Initio graphs using the Validate category of components.
Thanks
Pulling out the records that are the same in two files
I have two files and I want to compare their records. After comparing, I want to pull out the records that are the same in both files, and I want a new record from the unmatched records. The record format is like this: decimal(3) cust_id; string(6) cust_name; string(" ") address; same...
Use a full outer join. Match on the name as the key, and you can pull out both the matching and the unmatched records.
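The full outer join suggestion can be illustrated with a Python sketch. It is not Ab Initio code; the sample rows are invented, the field names follow the record format in the question, and the key is assumed to be unique within each file.

```python
def full_outer_join(left, right, key):
    """Split two record lists into matched and unmatched sets on `key`.

    Assumes `key` is unique within each input.
    """
    left_by_key = {rec[key]: rec for rec in left}
    right_by_key = {rec[key]: rec for rec in right}
    both = left_by_key.keys() & right_by_key.keys()
    matched = [(left_by_key[k], right_by_key[k]) for k in sorted(both)]
    left_only = [left_by_key[k] for k in sorted(left_by_key.keys() - both)]
    right_only = [right_by_key[k] for k in sorted(right_by_key.keys() - both)]
    return matched, left_only, right_only


# Invented sample data for illustration only.
file1 = [
    {"cust_id": 1, "cust_name": "alice", "address": "12 Elm St"},
    {"cust_id": 2, "cust_name": "bob", "address": "3 Oak Ave"},
]
file2 = [
    {"cust_id": 2, "cust_name": "bob", "address": "3 Oak Ave"},
    {"cust_id": 3, "cust_name": "carol", "address": "9 Pine Rd"},
]
matched, only_in_file1, only_in_file2 = full_outer_join(file1, file2, "cust_name")
```

`matched` holds the pairs present in both files, while `only_in_file1` and `only_in_file2` hold the unmatched records from each side.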
What is skew and skew measurement?
Skew of a partition is the amount by which its size deviates from the average partition size
Statistically, skew represents the distribution of data. When all partitions hold an equal amount of data, partitioning is being used at its best. This can be achieved by partition-by-round-robin or by using equ...
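As a rough worked example of the definition above, the Python sketch below computes a skew figure per partition. Expressing the deviation as a percentage of the largest partition is an assumption of this sketch (a commonly quoted convention), not something stated in the answers; `partition_skew` is a hypothetical helper.

```python
def partition_skew(sizes):
    """Return a skew percentage for each partition.

    Assumed formula: (partition size - average size) / largest size * 100.
    """
    average = sum(sizes) / len(sizes)
    largest = max(sizes)
    if largest == 0:
        # All partitions are empty, so nothing deviates from the average.
        return [0.0 for _ in sizes]
    return [(size - average) / largest * 100 for size in sizes]


balanced = partition_skew([100, 100, 100, 100])
skewed = partition_skew([50, 150])
```

Evenly sized partitions give a skew of zero everywhere; a 50/150 split gives one negative and one positive value of equal magnitude.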
When using an in-memory Join, we choose one of the inputs as the driving port. The driving input is not loaded into memory; all the other inputs are. The driving port should be the largest input. This can be set in the parameter.
The driving port is simply the largest input.
If I have 2 files containing the fields file1(a,b,c) and file2(a,b,d), and we partition both files on key a using Partition by Key and pass the output to a Join component, will it join if the join key is (a,b), and why?
Yes, this is going to work fine provided you do it as in-memory. Let me explain why: firstly, whenever you are using the field A as a key, for the same data in both the files, would definitely go int...
The partition key and join key do NOT have to be the exact same. In order to join properly, you just have to make sure the records being compared are in the same partition. So if the parti...
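The point that records being compared only need to land in the same partition can be checked with a small Python simulation. This is a sketch with invented data and hypothetical helper names; it stands in for Partition by Key and Join rather than reproducing them.

```python
def partition_by_key(records, key_fields, n_partitions):
    """Send each record to a partition chosen from its key fields."""
    partitions = [[] for _ in range(n_partitions)]
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        partitions[hash(key) % n_partitions].append(rec)
    return partitions


def inner_join(left, right, key_fields):
    """Join two record lists on equality of all `key_fields`."""
    index = {}
    for rec in right:
        index.setdefault(tuple(rec[f] for f in key_fields), []).append(rec)
    joined = []
    for rec in left:
        for match in index.get(tuple(rec[f] for f in key_fields), []):
            joined.append({**rec, **match})
    return joined


# Invented sample data for illustration only.
file1 = [
    {"a": 1, "b": 10, "c": "x"},
    {"a": 1, "b": 20, "c": "y"},
    {"a": 2, "b": 10, "c": "z"},
    {"a": 3, "b": 30, "c": "w"},
]
file2 = [
    {"a": 1, "b": 10, "d": "p"},
    {"a": 2, "b": 10, "d": "q"},
    {"a": 2, "b": 99, "d": "r"},
    {"a": 3, "b": 30, "d": "s"},
]

order = lambda rec: (rec["a"], rec["b"])

# Join the whole files on (a, b) as the reference result.
global_join = sorted(inner_join(file1, file2, ["a", "b"]), key=order)

# Partition both files on a only, then join each partition pair on (a, b).
parts1 = partition_by_key(file1, ["a"], 2)
parts2 = partition_by_key(file2, ["a"], 2)
partitioned_join = sorted(
    (row for p1, p2 in zip(parts1, parts2)
     for row in inner_join(p1, p2, ["a", "b"])),
    key=order,
)
```

Two records that agree on (a,b) necessarily agree on a, so they are always routed to the same partition and the partitioned join finds the same matches as the global one.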
Ab Initio: display records between 50 and 75
My input dataset has 100 records. I want the records between 50 and 75, and I don't want to read the 5th record. Which component do I have to use?
sed -n 50,75p file1 > file2
Use the LEADING RECORDS component with the condition:
next_in_sequence()>=50 && next_in_sequence()<=75
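For comparison, the same 1-based, inclusive range selection as the sed command above can be written in Python. This is only an illustration of the record-number filter, with made-up record values.

```python
from itertools import islice


def records_in_range(records, first, last):
    """Return records numbered first..last (1-based, inclusive)."""
    return list(islice(records, first - 1, last))


# Stand-in for a 100-record input dataset.
records = [f"rec{n}" for n in range(1, 101)]
selected = records_in_range(records, 50, 75)
```

The range 50 to 75 inclusive yields 26 records, and everything before record 50 is skipped without being kept.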
In my sandbox I have 10 graphs, and I checked those graphs in to the EME. I then checked out a graph and made modifications, but found that the modifications were wrong. What do I have to do to get the original graph back?
If you have not tagged the older version of the object (102) and still want to check out the older version, then create a tag for the older version (102) using the r-tag option and do a checkout.
Just unlock the graphs!
How to get dml using utilities in UNIX?
m_db < dbc name > -table < table name >
We can use the gendml command to generate the DML file. Please see the syntax of the command below:
m_db gendml ur_dbc_file -table -select sql statement > dml file name
hope this would help you :-) :-)
If a graph fails partway through loading 1 million records into a target table, what is the alternative solution? I.e., will you run the graph again? (The record count is very large.)
Use checkpoints and partitions.
We can commit intermediate results in the target table by creating a commit table in API mode. When we rerun the graph, it will skip over the previously committed records. Use m_db creat...
How does the force_error function work? If we set "never abort" in Reformat, will force_error stop the graph, or will it continue to process the next set of records?
It forces an error and writes the error note.
It will not stop the execution of the graph; it will continue with the next records.
It is used especially to send data to an error port, with the error message given in this function, when the data does not meet the specified condition. To abort the graph, use the force_abort function.
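The behaviour described in these answers (the failing record is diverted with its message and processing carries on) can be modelled in Python. This is a sketch of the semantics only: `ForcedError`, `reformat_never_abort` and the sample transform are hypothetical names, not Ab Initio functions.

```python
class ForcedError(Exception):
    """Raised by the model of force_error for a single record."""


def force_error(message):
    raise ForcedError(message)


def check_amount(rec):
    # Example rule: negative amounts are treated as errors.
    if rec["amount"] < 0:
        force_error("negative amount")
    return {"id": rec["id"], "amount": rec["amount"]}


def reformat_never_abort(records, transform):
    """Apply `transform` to each record; divert failures and keep going."""
    out, rejects, errors = [], [], []
    for rec in records:
        try:
            out.append(transform(rec))
        except ForcedError as exc:
            rejects.append(rec)       # the failing record itself
            errors.append(str(exc))   # the message passed to force_error
    return out, rejects, errors


records = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": -5},
    {"id": 3, "amount": 7},
]
out, rejects, errors = reformat_never_abort(records, check_amount)
```

Record 2 ends up in `rejects` with its message in `errors`, while records 1 and 3 are still processed.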
What is the relation between the EME, the GDE and the Co>Operating System?
The GDE runs on the Co>Op. Whatever you run in the GDE runs with the help of the Co>Op.
The EME is for version control of GDE-developed projects and plans. It is not strictly necessary to have the EME, I think.
The GDE is the Graphical Development Environment where the user creates graphs (it has a GUI and is installed on Windows). The Co>Op system is used for running the developed graphs either in Unix or in Wind...
m_dump is used to view data as in graphs
m_dump [-dml] [-file]
cat file.dml will work fine, but it won't display the record format in a formatted way. You can think of m_dump as a "view data, formatted" command.
Abinitio graph phasing and checkpoints
What is phasing & checkpointing? What is the use ?
For efficient graph management, we use phases to channel the priority of the outputs.
As for checkpoints, we use them to save the time it takes to rerun graphs.
With checkpoints in place, a restarted graph resumes from its last completed checkpoint.
Phasing: Phasing in Ab Initio means dividing a complex graph into pieces in order to improve performance by reducing resource utilization in the different phases within the same graph. Check...
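The idea of phases with a checkpoint after each one can be sketched in Python. This is a conceptual model only, not how the Co>Operating System implements recovery; `run_graph`, the three phase functions and the `state` dictionary are all hypothetical.

```python
def run_graph(phases, state):
    """Run phases in order, resuming after the last completed checkpoint.

    state["checkpoint"] holds the number of phases already completed.
    """
    for index in range(state.get("checkpoint", 0), len(phases)):
        phases[index]()                  # may raise; checkpoint stays put
        state["checkpoint"] = index + 1  # phase finished, record checkpoint


log = []
attempts = {"load": 0}


def extract():
    log.append("extract")


def transform():
    log.append("transform")


def load():
    attempts["load"] += 1
    if attempts["load"] == 1:
        raise RuntimeError("simulated failure in the load phase")
    log.append("load")


state = {}
phases = [extract, transform, load]

try:
    run_graph(phases, state)   # first run fails in the third phase
except RuntimeError:
    pass

run_graph(phases, state)       # rerun resumes at the failed phase
```

On the rerun, the first two phases are not repeated, which is the time saving the answers above refer to.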
What is the max-core value? What is the use of max-core?
The maximum memory used by a component.
Max-core is simply the memory allocated to the component.
Rollup, Sort, etc. are some of the components that use it.
temp file
.rec file contains the information which is required to rollback the graph when it fails
Generally it will have some meta char information with hld and nld extension.
On failure, we use m_rollback to roll the job back to the last committed checkpoint.
What is the syntax of m_dump command?
m_dump to display the record format: m_dump [-dml] [-datafile]
m_dump is an Ab Initio utility for viewing the contents of a file (serial or MFS) in a formatted way.
Syntax :
m_dump
There are other options such as :
-select
-no-print-data
-print-data
-start # -end #
-record
etc..
What are $AI parameters in Ab Initio?
Sandbox Level Parameters
$AI parameters are sandbox level parameters like $AI_XFR, $AI_XML
Co>Op: 3.1.5
GDE: 3.1.5
For any concern please check in online discussion browser.
3.1.4