summaryrefslogtreecommitdiff
path: root/25_break_encr/README
diff options
context:
space:
mode:
authorHaidong Ji2022-04-15 15:51:30 -0500
committerHaidong Ji2022-04-15 15:51:30 -0500
commit442a49ad5a48d417345959b903ae6a6d32d55759 (patch)
treec7127bb497e5e439018b1915e0136eec2c9cb124 /25_break_encr/README
Great C programming funHEADmaster
Excellent fundamentals and displine training, many tools and techniques exercises: gdb, emacs, valgrind, git
Diffstat (limited to '25_break_encr/README')
-rw-r--r--25_break_encr/README51
1 files changed, 51 insertions, 0 deletions
diff --git a/25_break_encr/README b/25_break_encr/README
new file mode 100644
index 0000000..fdbc68d
--- /dev/null
+++ b/25_break_encr/README
@@ -0,0 +1,51 @@
+ In the last problem, you saw an implementation of a simple
+"encryption" program. In this problem, you will write a program
+that breaks that encryption---that is, it will take as input
+a text file encrypted with the encryption program you just
+saw, and prints out the key used to encrypt it!
+
+Breaking the Caesar Cipher uses a technique called "frequency counting."
+This technique relies on the fact that the distribution of letters
+in the English alphabet is far from uniform: 'e' is by far the most
+common letter (~13%), followed by 't' (9%), and 'a' (8%). Note
+that the average frequency is 100/26 ~= 4%.
+
+This frequency distribution means that if you know (or suspect)
+that a file contains English text encrypted with a Caesar Cipher,
+you can simply count the frequency of all letters in it, and guess
+that the letter which occurs most often is 'e'. Once you know
+which letter is 'e', you can backsolve for the key and decrypt
+the file. Note that in practice this requires a large enough
+text that "the law of large numbers" applies---and while it is
+not guaranteed to work, it typically does.
+
+
+Requirements:
+=============
+
+ - Your program will take one command line argument: the name
+ of a file to read as input
+ - Your program will then perform frequency counting on the letters
+ in that text file. Your program should ignore all non-letter
+ characters (those un-modified by your problem 1 program).
+ - Your program should use the frequency count information to
+ determine which letter is 'e', and solve for the key.
+ - On success, your program should print a single line of output to
+ stdout, which should contain one decimal (base 10) integer (followed by a
+ newline character). This number should be the encryption key used
+ on the text. It should be in the range [0,26). That is, the
+ number you print should obey 0 <= answer < 26.
+ - On failure, your program should print an appropriate error
+ message to stderr, then exit with EXIT_FAILURE.
+ - Provide a Makefile which compiles your program into
+ a binary called "breaker"
+Hints:
+ - Divide this problem into sub-problems. You should probably
+ write at least two functions (other than main) to do this.
+ - Think about how you want to represent your data. An array
+ might be handy somewhere.
+ - You may find a function you wrote in a previous classwork
+ useful for this assignment. Feel free to use or adapt it.
+ - Remember that everything (including characters) are numbers.
+ You might look back at the "Read encryption" assignment
+ to see an example of doing math on characters.