Workflow extractors automatically convert complex, nested JSON responses from APIs into flattened CSV files for quick consumption. Because this process is automatic, your workflow results may contain many columns you don't need.
To trim the output, we can give the extractor a "whitelist" of only the columns we want. To do this, we'll run an initial execution to see what the CSV file looks like, select the columns we want, and then configure our extractor.
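To see why flattening tends to produce so many columns, here is a minimal Python sketch of one common way to flatten nested JSON. It is not the extractor's actual implementation: the `flatten` helper, the `.` separator, and the sample record are all assumptions made up for illustration, and the real column names in your CSV may follow a different pattern.

```python
def flatten(value, prefix="", sep="."):
    """Flatten nested dicts/lists into a single-level dict of column -> value."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            name = f"{prefix}{sep}{key}" if prefix else str(key)
            flat.update(flatten(child, name, sep))
    elif isinstance(value, list):
        # Each list element becomes its own column, keyed by its index.
        for index, child in enumerate(value):
            name = f"{prefix}{sep}{index}" if prefix else str(index)
            flat.update(flatten(child, name, sep))
    else:
        flat[prefix] = value
    return flat


# Hypothetical API record, not taken from a real response.
record = {
    "id": "abc",
    "snippet": {"author": {"name": "Ann", "channel": "UC1"}, "text": "Nice video"},
    "tags": ["a", "b"],
}

row = flatten(record)
print(list(row))
```

Every nested key and every list element turns into its own column, which is why even a small response can yield a wide CSV file.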
1 - Run an Initial Workflow Execution
You can skip this step if you've already downloaded the output with the unneeded columns.
First, run the workflow to see what the output looks like in CSV format. For a quick run, we suggest using a Shared Proxy (to avoid the price of a dedicated proxy and the wait for one to launch) and setting the "Maximum Requests" of the workflow to 10 (so it performs only 10 requests and then stops).
Execute the workflow and then take a look at the output files - the names of the files will correspond to the extractors set in the workflow. E.g. in this workflow we have 2 extractors set, meaning we'll get back 2 CSV files:
In this case, we'll take a look at the YouTube_Comments.csv file, corresponding to the "YouTube Comments" extractor above. We can see in the CSV file that a lot of the columns are repetitive and unneeded, and we may really only want columns K and M:
To restrict the output to only these columns, we need to copy the header names of these columns exactly, which in this case are:
2 - Enter the Column Whitelist in the Extractor
Now that we have the names of the columns we want, we can enter them into our whitelist for the extractor. We first need to click on the extractor that corresponds to our file by matching the name, e.g. the YouTube_Comments.csv file matches the "YouTube Comments" extractor in this example, so we'll click on that:
Next, click the "Edit" button and you'll be able to edit the extractor. Look for the "Output Field Whitelist" section and paste in the column names that you want:
Hit "Save Changes" and you're all set. Run the workflow again and the output will contain only the columns you entered.
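If you want to preview the effect of a whitelist on a CSV file you have already downloaded, the same filtering can be approximated locally. The sketch below uses Python's standard `csv` module. The `filter_columns` helper and the sample column names are hypothetical, and this only mimics the idea of a whitelist rather than reproducing how the extractor applies it.

```python
import csv
import io


def filter_columns(src, dst, whitelist):
    """Copy only the whitelisted columns from the CSV in src to dst."""
    reader = csv.DictReader(src)
    header = reader.fieldnames or []
    # Names must match the header exactly, including case and spacing.
    missing = [name for name in whitelist if name not in header]
    if missing:
        raise KeyError(f"Columns not found in CSV header: {missing}")
    writer = csv.DictWriter(dst, fieldnames=whitelist, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Hypothetical sample data standing in for a downloaded CSV file.
sample = "id,author,likes,text\n1,Ann,3,Nice video\n2,Bob,0,First\n"
out = io.StringIO()
filter_columns(io.StringIO(sample), out, ["author", "text"])
print(out.getvalue())
```

To run this against a real file, pass file objects opened with `open(path, newline="")` in place of the `io.StringIO` objects. The `KeyError` check reflects the point made above about copying header names exactly: a name that differs even in capitalization will not match.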