
Macros In Pig

In this blog, we will discuss macros in Pig and how to implement them. We recommend going through our previous blog, Beginner’s guide for Pig, first.

Macros were introduced in Pig 0.9. They make Pig Latin code more modular and easier to share.

Macros can be implemented in Pig in three ways:

  • Defining Macros
  • Importing Macros
  • Expanding Macros

Let’s now look at the steps to implement macros by defining them.

Step 1: Select the input file as shown below and copy it to HDFS.

Step 2: Create a macro that keeps only the records whose group id equals 120.

-- Returns the rows of $pigrel_var whose $column_var equals 120.
DEFINE filter_op(pigrel_var, column_var) returns z {

$z = filter $pigrel_var by $column_var == 120;

};

The above macro takes two inputs: the first, pigrel_var, is the relation to filter, and the second, column_var, is the column to test. The result is returned through the alias z.

Here, the macro keeps a record only if its column_var value is equal to 120.
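The value 120 is hard-coded in the macro above. As a minimal sketch of how a macro can be made more reusable, the variant below takes the comparison value as a third parameter. The names filter_by_value and value are hypothetical and are not part of the original example; the load statement mirrors the one used later in this blog.

```pig
-- Hypothetical variant: the value to match is passed in as a parameter.
DEFINE filter_by_value(pigrel_var, column_var, value) returns z {
    $z = filter $pigrel_var by $column_var == $value;
};

a = load '/stud_performance' using PigStorage(',') as (rno:int, name:chararray, marks:int, group_id:int);

-- Same result as the hard-coded macro when called with 120.
x = filter_by_value(a, group_id, 120);

dump x;
```

With this form, the same macro could be reused for any other group id by changing only the last argument.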

Step 3: Create the macro usage code as shown below:

a = load '/stud_performance' using PigStorage(',') as (rno:int, name:chararray, marks:int, group_id:int);

x = filter_op(a,group_id);

dump x;
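Pig expands a macro call inline, so the call to filter_op should behave like the hand-written statements below. This is a sketch of the equivalent logic and not the literal expanded script, since Pig may rename aliases that are internal to a macro when it expands it.

```pig
-- Equivalent of x = filter_op(a, group_id); written without the macro.
a = load '/stud_performance' using PigStorage(',') as (rno:int, name:chararray, marks:int, group_id:int);

x = filter a by group_id == 120;

dump x;
```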

Step 4: Save the macro definition and the macro usage code in the same file, named embed_macro.pig.
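Putting the pieces from Steps 2 and 3 together, embed_macro.pig might look like the sketch below. It assumes the macro definition is placed before its first use and that string literals use plain ASCII single quotes.

```pig
-- embed_macro.pig: macro definition followed by its usage.

-- Macro: keep only the rows of $pigrel_var whose $column_var equals 120.
DEFINE filter_op(pigrel_var, column_var) returns z {
    $z = filter $pigrel_var by $column_var == 120;
};

-- Load the comma-separated input file from HDFS.
a = load '/stud_performance' using PigStorage(',') as (rno:int, name:chararray, marks:int, group_id:int);

-- Call the macro on relation a, testing the group_id column.
x = filter_op(a, group_id);

dump x;
```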

Step 5: Run the file with the -f option, for example: pig -f embed_macro.pig

Step 6: The records with the group id as 120 are displayed.

We hope this blog helped you in understanding the implementation of macros by defining them. In our next blog, we will discuss the implementation of macros in Pig scripts by importing them. Keep visiting our site for more updates on Big data and other technologies.


