In this blog, we will see how to convert data from JSON format into CSV format using Pig.
We created our own JSON data from a CSV file using the Avro file format, and we will use that same JSON data in this blog.
To see how we converted CSV to JSON using Avro, refer to the blog below, under the section Converting Avro to JSON.
You can also download the dataset from this link.
We will now load the JSON data into Pig using the command below:
loadJson = LOAD '/olympic.json' USING JsonLoader('athelete:chararray,age:INT,country:chararray,year:chararray,closing:chararray,sport:chararray,gold:INT,silver:INT,bronze:INT,total:INT');
Pig provides a built-in loader for JSON data, and the command above uses it to load the file into Pig.
In this case, we are using JsonLoader() as our loader function, with a schema that names each field and gives its type.
Now that the JSON data is loaded into Pig, converting it to CSV only requires storing the relation with a CSV storage function.
When we load JSON data with JsonLoader, the loader parses each record into a tuple of fields, so Pig displays the records as comma-separated values. You can see the output in the below screenshot.
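To check what the loader produced before converting anything, a few statements along these lines can be run in the Grunt shell. This is only a sketch: it assumes the loadJson relation from the LOAD command above, and the alias preview is a name made up for this example.

```pig
-- Print the schema that Pig attached to the loaded relation.
DESCRIBE loadJson;

-- Keep only the first five records so the console output stays short.
-- 'preview' is a hypothetical alias used only in this sketch.
preview = LIMIT loadJson 5;

-- Print the sampled records; each one appears as a tuple of fields.
DUMP preview;
```

Using LIMIT before DUMP avoids printing the whole dataset to the console.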
STORE loadJson INTO '/pig_conversions/json_to_csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage();
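The above command stores the output as CSV using CSVExcelStorage. This storage function ships in Piggybank rather than in Pig's core, so the Piggybank jar has to be registered before the STORE statement will run.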
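As a rough sketch of the complete store step, the statements below register Piggybank and then write the CSV with a header row. The jar path and the output directory are hypothetical and need to be adjusted to your installation. The four-argument form of CSVExcelStorage (delimiter, multiline handling, line-ending style, header handling) is assumed to be available in your Piggybank version; older releases may only accept fewer arguments.

```pig
-- Make the Piggybank classes available to Pig.
-- The jar location below is an assumption; use the path from your own setup.
REGISTER /usr/local/pig/lib/piggybank.jar;

-- Write loadJson as comma-separated text with a header line built from the schema.
-- The output directory is hypothetical and must not already exist.
STORE loadJson INTO '/pig_conversions/json_to_csv_with_header'
    USING org.apache.pig.piggybank.storage.CSVExcelStorage(',', 'NO_MULTILINE', 'UNIX', 'WRITE_OUTPUT_HEADER');
```

A header row is optional, but it helps when the file is later opened in a spreadsheet or another tool that expects column names.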
You can download the CSV file from the output directory /pig_conversions/json_to_csv, where it is written as part-m-00000.
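One possible way to look at the result and fetch it without leaving the Grunt shell is sketched below. It assumes the output was written to the file system Pig is connected to, and the local file name olympic.csv is made up for this example.

```pig
-- List the files produced by the STORE job.
fs -ls /pig_conversions/json_to_csv

-- Print the part file to the console to check the CSV rows.
fs -cat /pig_conversions/json_to_csv/part-m-00000

-- Copy the part file to the local working directory.
-- 'olympic.csv' is a hypothetical local file name.
fs -copyToLocal /pig_conversions/json_to_csv/part-m-00000 olympic.csv
```

If the job runs with more than one mapper, the directory will contain several part files, and each of them holds a share of the rows.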
The output looks like this:
With the data in CSV format, analysis becomes easier.
We hope this blog helped you learn how to convert JSON data into CSV format using Pig.