Sunday 21 February 2021

Replacing CRLF from files

Remove CRLF from files


The following are a number of methods you can use to strip ^M (carriage return) characters from files.

This was more of an exercise than something I need regularly. However, there was a recent incident at work where someone had checked in such an abomination. Some tools just don't like Windows line endings, i.e. CRLF (Carriage Return, Line Feed), in files. So, as an exercise, here is a collation of the numerous ways to fix these aberrations ...

Note: Most editors have a quick way of doing this.

file

You can see whether a file has ^M using the file command:

Example

For a file containing ^M, file reports:

$ file test.txt
test.txt: ASCII text, with CRLF, LF line terminators

After stripping ^M, we have:

$ file test-fixed.txt
test-fixed.txt: ASCII text
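
Where the file command is not available, a similar check can be sketched in Python. This is only an illustration, not how file itself works, and the function name count_line_endings is made up for this example:

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, bare LF and bare CR line endings in raw file contents."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one LF and one CR, so subtract the pairs.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"crlf": crlf, "lf": lf, "cr": cr}


sample = b"one\r\ntwo\nthree\r\n"
print(count_line_endings(sample))
```

For a real file, read the contents in binary mode (open(path, "rb")) so that Python does not translate the line endings before they are counted.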

dos2unix

Probably the simplest way is to use the dos2unix command.

Example

Show that the file has ^M line endings (the first three columns count the DOS, Unix and Mac line breaks):

$ dos2unix -i test.txt
       3      16       0  no_bom    text    test.txt

Now fix:

$ dos2unix test.txt

Proof that file has been fixed:

$ dos2unix -i test.txt
       0      19       0  no_bom    text    test.txt

Haskell

There is a simple Haskell program to do this:

fixcrlf.hs
#!/usr/bin/env runhaskell
 
import           System.Environment (getArgs, getProgName)
 
main = getArgs >>= \args -> case length args of
  1 -> mapM_ (putStrLn . filter (/= '\r')) . lines =<< readFile (head args)
  _ -> getProgName >>= \p -> error $ "Usage: " ++ p ++ " [file_name]"

Example

$ ./fixcrlf.hs test.txt > test-fixed.txt
$ file test.txt test-fixed.txt
test.txt:       ASCII text, with CRLF line terminators
test-fixed.txt: ASCII text

Perl

Using Perl:

$ perl -pi~ -e 's/^M//g' source.file

Where ^M is a control character entered by CTRL-v followed by CTRL-m. Perl also accepts the escape \r in its place.

This edits the file in place and keeps a backup of the original in source.file~.

Example

$ perl -pi~ -e 's/^M//g' test.txt
$ file test.txt~ test.txt
test.txt~: ASCII text, with CRLF line terminators
test.txt:  ASCII text

Python

Using Python:

fixcrlf.py
#!/usr/bin/env python3
 
import sys
 
if len(sys.argv) == 2:
    with open(sys.argv[1], "r") as f:
        for line in f:
            print(line.rstrip('\r\n'))
else:
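    # Report usage on stderr with a non-zero exit status, so the message
    # does not end up in the redirected output file.
    sys.exit(f"Usage: {sys.argv[0]} [file-to-fix] > [fixed-file]")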

Example

$ ./fixcrlf.py test.txt > test-fixed.txt
$ file test.txt test-fixed.txt
test.txt:       ASCII text, with CRLF line terminators
test-fixed.txt: ASCII text
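
The script above reads the file in text mode, where Python already translates line endings while reading. A variant sketch that works on raw bytes instead, and so converts only CRLF pairs while leaving everything else untouched, could look like this (the names crlf_to_lf and fix_file are illustrative):

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF; lone CR bytes are kept."""
    return data.replace(b"\r\n", b"\n")


def fix_file(src: str, dst: str) -> None:
    """Write a copy of src to dst with CRLF line endings converted to LF."""
    with open(src, "rb") as fin:
        data = fin.read()
    with open(dst, "wb") as fout:
        fout.write(crlf_to_lf(data))
```

For example, fix_file("test.txt", "test-fixed.txt") would produce the same fixed file as the redirect used above.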

Ruby

Using Ruby:

fixcrlf.rb
#!/usr/bin/env ruby
 
raise "Usage: #{$PROGRAM_NAME} [file-to-fix] > [fixed-file]" if ARGV.length != 1
 
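# chomp removes only a trailing line ending; chop would also eat the last
# character of a final line that has no newline.
File.foreach(ARGV[0]) { |line| puts line.chomp }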

Example

$ ./fixcrlf.rb test.txt > test-fixed.txt 
$ file test.txt test-fixed.txt
test.txt:       ASCII text, with CRLF line terminators
test-fixed.txt: ASCII text

Using shell tools

Using the translate tool, tr:

$ tr -d '\r' < source.file > source.file.fixed

This can also be used in a more general fashion to remove non-printable characters from a file (note that this also removes tabs, which are not in the [:print:] class):

$ tr -dc '[:print:]\n' < source.file > source.file.changed

Example

$ tr -d '\r' < test.txt > test-fixed.txt
$ file test.txt test-fixed.txt
test.txt:       ASCII text, with CRLF line terminators
test-fixed.txt: ASCII text

Friday 18 September 2020

Magic Triangle - Solution

I first saw this puzzle in CSIRO’s Double Helix here.

The problem we are solving is described as:

Given a triangle with a circle on each point and on each side:

Magic Triangle

Then, using each of the numbers from 1 to 6 once, place them in the circles so that there are three numbers on each side. Swap them around until the sides all add up to the same number.

Finally, arrange them so that each side sums to 10.

Method

Let’s label the triangle: starting from any vertex and working around the triangle, label the nodes a to f, so that a, c and e are the vertices:

Labelled Magic Triangle

The method to solve this problem is broken into the following steps:

  • get all permutations of numbers 1 to 6 as a, b, c, d, e, f

  • filter the permutations to satisfy the conditions:

    • a + b + c == c + d + e == e + f + a

    • and the final condition: a + b + c == 10

Using Haskell

All permutations of numbers 1 to 6:

This will give 6! = 720 permutations.

Filter on sides summing up to the same value:

Which gives all solutions where the sides are equal sums:

  [(3,2,5),(5,4,1),(1,6,3)]
  [(2,4,3),(3,5,1),(1,6,2)]
  [(1,5,3),(3,4,2),(2,6,1)]
  [(1,4,5),(5,2,3),(3,6,1)]
  [(5,3,4),(4,2,6),(6,1,5)]
  [(6,2,4),(4,3,5),(5,1,6)]
  [(4,5,2),(2,3,6),(6,1,4)]
  [(6,3,2),(2,5,4),(4,1,6)]
  [(2,3,6),(6,1,4),(4,5,2)]
  [(4,1,6),(6,3,2),(2,5,4)]
  [(3,4,2),(2,6,1),(1,5,3)]
  [(1,6,2),(2,4,3),(3,5,1)]
  [(5,4,1),(1,6,3),(3,2,5)]
  [(4,3,5),(5,1,6),(6,2,4)]
  [(6,1,5),(5,3,4),(4,2,6)]
  [(3,6,1),(1,4,5),(5,2,3)]
  [(5,2,3),(3,6,1),(1,4,5)]
  [(2,6,1),(1,5,3),(3,4,2)]
  [(3,5,1),(1,6,2),(2,4,3)]
  [(1,6,3),(3,2,5),(5,4,1)]
  [(5,1,6),(6,2,4),(4,3,5)]
  [(2,5,4),(4,1,6),(6,3,2)]
  [(6,1,4),(4,5,2),(2,3,6)]
  [(4,2,6),(6,1,5),(5,3,4)]

Filter on sides summing to 10:

Which gives our final list of solutions:

  [(3,2,5),(5,4,1),(1,6,3)]
  [(1,4,5),(5,2,3),(3,6,1)]
  [(5,4,1),(1,6,3),(3,2,5)]
  [(3,6,1),(1,4,5),(5,2,3)]
  [(5,2,3),(3,6,1),(1,4,5)]
  [(1,6,3),(3,2,5),(5,4,1)]

Note here that the solutions aren’t unique: there are repetitions if you consider rotations or node reversals. Can we filter these out to get the only unique solution?

Try ordering:

The idea here is that as the nodes are unique, we can order them. This yields our final solution:

  [(5,2,3),(3,6,1),(1,4,5)]

Using one other Haskell refinement we can write this as:

Solved Magic Triangle

Check your answer on the CSIRO page here.

Using Python

Python has list comprehensions, like many other programming languages, so the solution is much the same. We also use the permutations function from the standard library's itertools module:
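
A minimal sketch of that approach follows. The vertex ordering a > c > e used to drop rotations and reflections is an assumption, chosen because it reproduces the single solution shown, and the function name magic_triangle is illustrative:

```python
from itertools import permutations


def magic_triangle(side_sum=10):
    """Return the sides of every labelling whose three sides equal side_sum."""
    return [
        [(a, b, c), (c, d, e), (e, f, a)]
        for a, b, c, d, e, f in permutations(range(1, 7))
        # All three sides must add up to the requested total ...
        if a + b + c == c + d + e == e + f + a == side_sum
        # ... and ordering the vertices keeps one representative of each
        # family of rotations and reflections (assumed ordering).
        and a > c > e
    ]


print(magic_triangle())
```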

Which yields the same results as our previous solution in Haskell:

>>> [[(5, 2, 3), (3, 6, 1), (1, 4, 5)]]
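
The earlier observation that the side-sum-10 solutions are repetitions under rotation and node reversal can be checked directly. The sketch below (the helper name symmetries is illustrative) generates the six relabellings of the solution above and compares them with every permutation whose sides sum to 10:

```python
from itertools import permutations


def symmetries(nodes):
    """Return the rotations and reflections of a triangle labelled (a, b, c, d, e, f)."""
    variants = set()
    for start in (0, 2, 4):  # start the labelling at each vertex in turn
        rotated = nodes[start:] + nodes[:start]
        variants.add(rotated)
        # Reflection: keep the first vertex and walk round the other way.
        variants.add(rotated[:1] + rotated[:0:-1])
    return variants


side_sum_10 = {
    (a, b, c, d, e, f)
    for a, b, c, d, e, f in permutations(range(1, 7))
    if a + b + c == c + d + e == e + f + a == 10
}

print(side_sum_10 == symmetries((5, 2, 3, 6, 1, 4)))
```

This prints True: all six solutions are the same triangle read from a different vertex or in the opposite direction.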