
Converting CSV to JSON Using Pig

In this blog, we will discuss how to convert CSV data to JSON using Pig.

As input, we will use the CSV data that was produced by converting XML data in our first case.

A = LOAD '/pig_conversions/xml_to_csv/part-m-00000' USING PigStorage(',') as (name:chararray,value:chararray);

The above command loads the CSV file (the output of the XML to CSV conversion) into Pig using PigStorage, with ',' as the field delimiter. Each row is read as two chararray fields, name and value.

Now we will store this data in JSON format using the JsonStorage function that is built into Pig.
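Before storing the data, it can help to confirm that the relation was loaded with the expected schema. The following is a minimal sketch for the Grunt shell, assuming relation A from the LOAD statement above; the alias A_sample and the row count of 5 are illustrative choices, not part of the original steps.

```pig
-- Print the schema Pig attached to relation A (two chararray fields are expected)
DESCRIBE A;

-- Take a handful of rows so that DUMP does not print the whole file
-- (A_sample and the limit of 5 are illustrative assumptions)
A_sample = LIMIT A 5;
DUMP A_sample;
```

If a field shows up empty here, the delimiter passed to PigStorage probably does not match the file.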

STORE A INTO '/pig_conversions/csv_to_json' USING org.apache.pig.builtin.JsonStorage();

We have successfully stored the CSV data in JSON format. The output is written to the location /pig_conversions/csv_to_json.
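One practical note when re-running the script: STORE fails if the output directory already exists. A minimal sketch of a re-run, assuming the existing output is safe to delete, could look like this in the Grunt shell:

```pig
-- Remove any previous output; rmf does not complain if the path is missing
-- (assumption: nothing else needs the old contents of this directory)
rmf /pig_conversions/csv_to_json

-- Store the relation again in JSON format
STORE A INTO '/pig_conversions/csv_to_json' USING org.apache.pig.builtin.JsonStorage();
```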


In this location we can see four files: the success file, the part-m-00000 file that contains the output, the schema file that contains the JSON schema, and the header file that contains the column names.

After downloading these files, we can see that the pig header file contains the two column names that were given while loading the data.

If we download and view the schema file, named pig_schema, we can see the schema of the stored data, that is, the field names and their types.

This schema is generated automatically by JsonStorage.
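The same listing can be produced from the Grunt shell without using the web interface. This is a small sketch using the output path from the STORE statement; on many Pig versions the schema and header files are written with dot-prefixed names (.pig_schema and .pig_header), so treat the exact file names as an assumption about your setup.

```pig
-- List the files that STORE wrote into the output directory
fs -ls /pig_conversions/csv_to_json
```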

The newly created JSON data can be retrieved from the part file.

Now let us see the contents of the part-m-00000 file.
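The part file can also be inspected directly from the Grunt shell, and the JSON can be loaded back into Pig to check the round trip. The sketch below assumes the output path used above and the same two-field schema; the alias B is hypothetical. JsonStorage writes one JSON object per line, with the field names as keys, so a CSV row with the (made-up) values foo and bar would appear as {"name":"foo","value":"bar"}.

```pig
-- Print the stored JSON records, one object per line
fs -cat /pig_conversions/csv_to_json/part-m-00000

-- Load the JSON back into Pig using the same schema as the original relation
-- (B is an illustrative alias)
B = LOAD '/pig_conversions/csv_to_json'
    USING org.apache.pig.builtin.JsonLoader('name:chararray,value:chararray');

-- The rows printed here should match the rows of the original CSV
DUMP B;
```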

We hope this blog helped you learn how to convert CSV data into JSON format using Pig.

