diff options
author | Haidong Ji | 2022-04-15 15:51:30 -0500 |
---|---|---|
committer | Haidong Ji | 2022-04-15 15:51:30 -0500 |
commit | 442a49ad5a48d417345959b903ae6a6d32d55759 (patch) | |
tree | c7127bb497e5e439018b1915e0136eec2c9cb124 /25_break_encr/README |
Excellent fundamentals and displine training, many tools and techniques
exercises: gdb, emacs, valgrind, git
Diffstat (limited to '25_break_encr/README')
-rw-r--r-- | 25_break_encr/README | 51 |
1 files changed, 51 insertions, 0 deletions
diff --git a/25_break_encr/README b/25_break_encr/README new file mode 100644 index 0000000..fdbc68d --- /dev/null +++ b/25_break_encr/README @@ -0,0 +1,51 @@ + In the last problem, you saw an implementation of a simple +"encryption" program. In this problem, you will write a program +that breaks that encryption---that is, it will take as input +a text file encrypted with the encryption program you just +saw, and prints out the key used to encrypt it! + +Breaking the Caesar Cipher uses a technique called "frequency counting." +This technique relies on the fact that the distribution of letters +in the English alphabet is far from uniform: 'e' is by far the most +common letter (~13%), followed by 't' (9%), and 'a' (8%). Note +that the average frequency is 100/26 ~= 4%. + +This frequency distribution means that if you know (or suspect) +that a file contains English text encrypted with a Caesar Cipher, +you can simply count the frequency of all letters in it, and guess +that the letter which occurs most often is 'e'. Once you know +which letter is 'e', you can backsolve for the key and decrypt +the file. Note that in practice this requires a large enough +text that "the law of large numbers" applies---and while it is +not guaranteed to work, it typically does. + + +Requirements: +============= + + - Your program will take one command line argument: the name + of a file to read as input + - Your program will then perform frequency counting on the letters + in that text file. Your program should ignore all non-letter + characters (those un-modified by your problem 1 program). + - Your program should use the frequency count information to + determine which letter is 'e', and solve for the key. + - On success, your program should print a single line of output to + stdout, which should contain one decimal (base 10) integer (followed by a + newline character). This number should be the encryption key used + on the text. It should be in the range [0,26). That is, the + number you print should obey 0 <= answer < 26. + - On failure, your program should print an appropriate error + message to stderr, then exit with EXIT_FAILURE. + - Provide a Makefile which compiles your program into + a binary called "breaker" +Hints: + - Divide this problem into sub-problems. You should probably + write at least two functions (other than main) to do this. + - Think about how you want to represent your data. An array + might be handy somewhere. + - You may find a function you wrote in a previous classwork + useful for this assignment. Feel free to use or adapt it. + - Remember that everything (including characters) are numbers. + You might look back at the "Read encryption" assignment + to see an example of doing math on characters. |