diff options
Diffstat (limited to '25_break_encr/README')
| -rw-r--r-- | 25_break_encr/README | 51 | 
1 files changed, 51 insertions, 0 deletions
| diff --git a/25_break_encr/README b/25_break_encr/README new file mode 100644 index 0000000..fdbc68d --- /dev/null +++ b/25_break_encr/README @@ -0,0 +1,51 @@ + In the last problem, you saw an implementation of a simple +"encryption" program.  In this problem, you will write a program +that breaks that encryption---that is, it will take as input +a text file encrypted with the encryption program you just +saw, and prints out the key used to encrypt it! + +Breaking the Caesar Cipher uses a technique called "frequency counting." +This technique relies on the fact that the distribution of letters +in the English alphabet is far from uniform: 'e' is by far the most +common letter (~13%), followed by 't' (9%), and 'a' (8%).  Note +that the average frequency is 100/26 ~= 4%.  + +This frequency distribution means that if you know (or suspect) +that a file contains English text encrypted with a Caesar Cipher, +you can simply count the frequency of all letters in it, and guess +that the letter which occurs most often is 'e'.   Once you know +which letter is 'e', you can backsolve for the key and decrypt +the file.  Note that in practice this requires a large enough +text that "the law of large numbers" applies---and while it is +not guaranteed to work, it typically does. + + +Requirements: +============= + + - Your program will take one command line argument: the name +   of a file to read as input + - Your program will then perform frequency counting on the letters +   in that text file.  Your program should ignore all non-letter +   characters (those un-modified by your problem 1 program). + - Your program should use the frequency count information to  +   determine which letter is 'e', and solve for the key. + - On success, your program should print a single line of output to +   stdout, which should contain one decimal (base 10) integer (followed by a +   newline character).  This number should be the encryption key used +   on the text.  It should be in the range [0,26).  That is, the +   number you print should obey 0 <= answer < 26. + - On failure, your program should print an appropriate error +   message to stderr, then exit with EXIT_FAILURE. + - Provide a Makefile which compiles your program into +   a binary called "breaker" +Hints: + - Divide this problem into sub-problems.  You should probably +   write at least two functions (other than main) to do this. + - Think about how you want to represent your data.  An array +   might be handy somewhere. + - You may find a function you wrote in a previous classwork  +   useful for this assignment.  Feel free to use or adapt it. + - Remember that everything (including characters) are numbers. +   You might look back at the "Read encryption" assignment +   to see an example of doing math on characters. | 
