JetThoughts

Show invalid CSV lines with sed

September 21st 2010

I had trouble with an invalidly formatted CSV file. The first step was to show the lines that are invalid:

sed -n '/"[^",]*"[^",]*"[^",]*",/,1p' <fileName>
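To see what this pattern catches, here is a small sketch. The file sample.csv and its three rows are made up for the illustration; only the sed command comes from the post.

```shell
# sample.csv is a hypothetical file created only for this illustration.
cat > sample.csv <<'EOF'
"1","Alice","New York"
"2","Acme "Best" Corp","Boston"
"3","Bob","Chicago"
EOF

# Print only the lines where a quoted field contains an extra pair of quotes
# and is followed by a comma.
sed -n '/"[^",]*"[^",]*"[^",]*",/,1p' sample.csv
```

This should print only the second row. The pattern requires a comma after the broken field, so a broken last field on a line would not show up with it.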

Then I searched Google for a way to replace a character inside quotes, and read the sed one-liners reference at http://sed.sourceforge.net/sed1line.txt. The result is a sed script with the following content, saved as script.sed:

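# Protect the real field separators with a placeholder.
s/\",\"/\$XXXX\$/g;
# A double quote with a non-comma character on each side is a stray one:
# turn it into a single quote, and repeat until nothing changes.
:a
s/\([^,]\)"\([^,]\)/\1'\2/g
ta
# Restore the field separators.
s/\$XXXX\$/\",\"/g;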

Then run it:

sed -f script.sed <fileName>
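As a quick check of what the script does, here is a minimal sketch that runs it on a made-up file. sample.csv and its rows are assumptions for the illustration; script.sed has the same commands as the script above.

```shell
# Hypothetical sample input, created only for this illustration.
cat > sample.csv <<'EOF'
"1","Alice","New York"
"2","Acme "Best" Corp","Boston"
"3","Bob","Chicago"
EOF

# The script from above, written out so the example is self-contained.
cat > script.sed <<'EOF'
s/\",\"/\$XXXX\$/g;
:a
s/\([^,]\)"\([^,]\)/\1'\2/g
ta
s/\$XXXX\$/\",\"/g;
EOF

# Print the repaired version to standard output; sample.csv is not modified.
sed -f script.sed sample.csv
```

The first and third rows should come out unchanged, and the second one should become "2","Acme 'Best' Corp","Boston", with the inner double quotes turned into single quotes.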

The output is a properly formatted CSV file. To apply the changes to the file itself, add the -i option with a backup suffix:

sed -i.bak -f script.sed <fileName>
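Here is a small sketch of the in-place run, assuming a made-up one-line file called demo.csv. It writes the suffix attached to the option (-i.bak, no space), since that form is accepted by both GNU sed and BSD sed, while the separated form is BSD-only.

```shell
# demo.csv is a hypothetical one-line file used only for this illustration.
printf '%s\n' '"2","Acme "Best" Corp","Boston"' > demo.csv

# The script from above, written out so the example is self-contained.
cat > script.sed <<'EOF'
s/\",\"/\$XXXX\$/g;
:a
s/\([^,]\)"\([^,]\)/\1'\2/g
ta
s/\$XXXX\$/\",\"/g;
EOF

# Edit demo.csv in place and keep the original as demo.csv.bak.
sed -i.bak -f script.sed demo.csv

cat demo.csv      # the repaired line
cat demo.csv.bak  # the untouched original
```

Keeping the backup makes it easy to compare the two files before throwing the original away.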