Converting JSON into CSV Using Pig

In this blog, we will see how to convert data from JSON format into CSV format using Pig.
We created our own JSON data from a CSV file using the Avro file format, and we will use that same JSON data in this blog.

To see how we converted CSV to JSON using Avro, refer to the blog below, under the section "Converting Avro to JSON":

Avro in Hive

You can also download the dataset from this link.

We will now load the JSON data into Pig using the command below:

loadJson = LOAD '/olympic.json' USING JsonLoader('athelete:chararray,age:INT,country:chararray,year:chararray,closing:chararray,sport:chararray,gold:INT,silver:INT,bronze:INT,total:INT');

Pig provides a built-in loader for JSON data. The above command uses it to load the file into a relation named loadJson, with the field names and types given in the schema string.

In this case, we are using JsonLoader() as our loader function.

Now that the JSON data is loaded into Pig, converting it into CSV only requires storing the relation with a CSV storage function.
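Before storing anything, it can help to confirm that the loader parsed the records as expected. A minimal sketch, assuming the same /olympic.json file and schema as above; the alias sample_rows is a name introduced here for illustration:

```pig
-- Load the JSON file with the schema used in this blog
loadJson = LOAD '/olympic.json' USING JsonLoader('athelete:chararray,age:INT,country:chararray,year:chararray,closing:chararray,sport:chararray,gold:INT,silver:INT,bronze:INT,total:INT');

-- Print the schema Pig attached to the relation
DESCRIBE loadJson;

-- Inspect a few parsed records instead of dumping the whole data set
sample_rows = LIMIT loadJson 5;
DUMP sample_rows;
```

DESCRIBE only prints the schema and does not launch a job, while DUMP runs the load and prints the selected tuples to the console.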

When we load JSON data with JsonLoader, the loader parses each record into its fields automatically, so the data already appears as rows of comma-separated values. You can see the output in the screenshot below.

STORE loadJson INTO '/pig_conversions/json_to_csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage();

The above command stores the output using the CSVExcelStorage function, which ships with Pig's piggybank library.
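Because CSVExcelStorage lives in the piggybank JAR rather than in Pig core, a script usually has to register that JAR before the STORE statement can resolve the class. A minimal end-to-end sketch; the JAR path /usr/lib/pig/piggybank.jar is an assumption and will differ between installations:

```pig
-- Assumed location of the piggybank JAR; adjust for your installation
REGISTER /usr/lib/pig/piggybank.jar;

-- Load the JSON file with the schema used in this blog
loadJson = LOAD '/olympic.json' USING JsonLoader('athelete:chararray,age:INT,country:chararray,year:chararray,closing:chararray,sport:chararray,gold:INT,silver:INT,bronze:INT,total:INT');

-- Write the relation out as comma-separated values
STORE loadJson INTO '/pig_conversions/json_to_csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage();
```

Note that STORE fails if the output directory already exists, so remove it or pick a new path before re-running the script.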

You can download the CSV file from the location /pig_conversions/json_to_csv, where it is stored under the name part-m-00000.
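The output file can be inspected and copied without leaving the Grunt shell, using Pig's fs command. A short sketch, assuming the output path above; the local file name olympic.csv is a hypothetical name chosen for illustration:

```pig
-- List the files written by the STORE job
fs -ls /pig_conversions/json_to_csv

-- Print the contents of the part file to the console
fs -cat /pig_conversions/json_to_csv/part-m-00000

-- Copy the part file to the local working directory as olympic.csv
fs -get /pig_conversions/json_to_csv/part-m-00000 olympic.csv
```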

The output looks like this

With the data in CSV format, analysis becomes easier, because CSV can be read directly by Pig, Hive, spreadsheets and most other tools.

We hope this blog helped you learn how to convert data from JSON format into CSV format using Pig.
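As an illustration of the kind of analysis this enables, the stored CSV can be loaded back into Pig and aggregated. A hedged sketch, assuming the output path and field layout used in this blog and the same assumed piggybank JAR location; the aliases medals, by_country, country_totals, ranked and top10 are introduced here for illustration:

```pig
-- Assumed location of the piggybank JAR; adjust for your installation
REGISTER /usr/lib/pig/piggybank.jar;

-- Read the CSV back, declaring the same fields as the JSON schema
medals = LOAD '/pig_conversions/json_to_csv'
         USING org.apache.pig.piggybank.storage.CSVExcelStorage(',')
         AS (athelete:chararray, age:int, country:chararray, year:chararray,
             closing:chararray, sport:chararray, gold:int, silver:int,
             bronze:int, total:int);

-- Sum the total medals per country
by_country = GROUP medals BY country;
country_totals = FOREACH by_country GENERATE group AS country, SUM(medals.total) AS total_medals;

-- Show the ten countries with the most medals
ranked = ORDER country_totals BY total_medals DESC;
top10 = LIMIT ranked 10;
DUMP top10;
```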

Keep visiting our website Acadgild for more updates on Big Data and other technologies.
