CSV Filter
CSV Filtering Tool
The CSV Filter Project is a command-line tool to filter out rows and/or columns from data files formatted according to CSV (comma-separated value) or similar conventions.
Row Filtering
Rows are filtered by giving regular expressions that match the contents of particular fields in each CSV record.
The --keep option specifies which records are to be kept. So, to select
records whose "State" field has the value "Washington" and filter out
all others, the following command can be used:
csv-filter --keep:State:Washington myfile.csv
If the --drop parameter is used, then matching records are dropped rather than being kept.
If the field named in a --keep or --drop option is the * character, then the regular expression is applied to all fields.
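The row-filtering rules above can be sketched in Python roughly as follows. This is only an illustration, not the tool's actual implementation: the helper names parse_rule and apply_rule are hypothetical, the expression is assumed to match anywhere within a field, and with * a record is assumed to match when any one of its fields matches.

```python
import csv
import io
import re


def parse_rule(arg):
    """Split an argument such as '--keep:State:Washington' into its parts."""
    # maxsplit=2 leaves any further colons inside the regular expression.
    action, field, pattern = arg.lstrip("-").split(":", 2)
    return action, field, re.compile(pattern)


def apply_rule(rows, action, field, regex):
    """Yield the dict rows that survive one keep or drop rule."""
    for row in rows:
        # "*" tests every field; otherwise only the named one.
        values = row.values() if field == "*" else [row.get(field, "")]
        # Assumptions: re.search semantics, and with "*" a record matches
        # if any one of its fields matches.
        matched = any(regex.search(v) for v in values if isinstance(v, str))
        if matched == (action == "keep"):
            yield row


sample = "Name,State\nAnn,Washington\nBob,Oregon\nCy,Washington\n"
rows = list(csv.DictReader(io.StringIO(sample)))

kept = [r["Name"] for r in apply_rule(rows, *parse_rule("--keep:State:Washington"))]
dropped = [r["Name"] for r in apply_rule(rows, *parse_rule("--drop:State:Washington"))]
print(kept)     # ['Ann', 'Cy']
print(dropped)  # ['Bob']
```

Whether a pattern must match the whole field or only part of it is a design decision the description leaves open; anchoring the expression with ^ and $ gives whole-field matching under the assumption made here.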
Column Filtering
Columns can be removed from the CSV data using the --cut option, naming one or more fields.
In all the options above, field numbers may be used in place of names.
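A minimal sketch of column removal that accepts either names or numbers might look like this. The function name cut_columns is hypothetical, and the sketch assumes that an all-digit field is a 1-based column number; the description does not say whether numbering starts at 0 or 1.

```python
def cut_columns(header, records, fields):
    """Return (header, records) with the named or numbered columns removed."""
    drop = set()
    for f in fields:
        # Assumption: an all-digit field is a 1-based column number;
        # anything else is looked up by name in the header row.
        drop.add(int(f) - 1 if f.isdigit() else header.index(f))
    keep = [i for i in range(len(header)) if i not in drop]
    return ([header[i] for i in keep],
            [[rec[i] for i in keep] for rec in records])


header = ["Name", "State", "Zip"]
records = [["Ann", "Washington", "98101"], ["Bob", "Oregon", "97201"]]

new_header, new_records = cut_columns(header, records, ["State", "3"])
print(new_header)   # ['Name']
print(new_records)  # [['Ann'], ['Bob']]
```

One consequence of this convention is that a column whose name consists only of digits cannot be addressed by name, so a real tool would need some rule to tell the two cases apart.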
Input Conventions
Your program should have some input conventions to alter the exact syntax of data records. For example, you should have an option to specify separation by semicolons (";") rather than commas.
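As one possible approach, Python's standard csv module already accepts an alternative separator through its delimiter parameter. The sketch below assumes the separator would be supplied by some command-line option; the function name read_records is hypothetical.

```python
import csv
import io


def read_records(stream, delimiter=","):
    """Parse CSV-like text from a stream using the given field separator."""
    return list(csv.reader(stream, delimiter=delimiter))


# A quoted field may itself contain the separator character.
data = io.StringIO('Name;State\nAnn;Washington\n"Smith; Bob";Oregon\n')
records = read_records(data, delimiter=";")
print(records)
# [['Name', 'State'], ['Ann', 'Washington'], ['Smith; Bob', 'Oregon']]
```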
Output
By default the extracted data should be written to stdout. However, you may also define a command-line option to generate output to a specific file.
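The output behaviour could be sketched as below. The option name --output (and its short form -o) and the helper names are assumptions made for illustration; the description only requires that stdout be the default.

```python
import argparse
import csv
import sys


def build_parser():
    """Build a parser with a hypothetical --output option."""
    parser = argparse.ArgumentParser(prog="csv-filter")
    parser.add_argument("-o", "--output", default=None,
                        help="write results to this file instead of stdout")
    return parser


def write_records(records, path=None):
    """Write records as CSV to the given file, or to stdout if path is None."""
    out = open(path, "w", newline="") if path else sys.stdout
    try:
        # lineterminator="\n" is a choice made here; the csv default is "\r\n".
        csv.writer(out, lineterminator="\n").writerows(records)
    finally:
        if path:
            out.close()


args = build_parser().parse_args([])  # no options given, so output is stdout
write_records([["Name", "State"], ["Ann", "Washington"]], args.output)
```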