Stable Matching Problem
Imagine a scenario where there are exactly n boys and exactly n girls. Each boy has a ranking of each girl, from most preferred to least. Likewise, each girl has a ranking of each boy. There are no ties, so that each boy has a list of girls of length n, and each girl has a list of boys of length n.
How should each boy and girl be paired?
A stable solution will be one where all pairs are stable. For a boy-girl pair, (b, g), to be stable, it must be true that b prefers g to every other interested girl, where interested means that she would prefer b to her current partner. If a pair is stable, then there will be no motivation for them to split. It is important to understand that this does not mean that each boy must be paired with his most preferred girl, only that no other girls that he prefers (with respect to his current girl) also prefer him (with respect to her current boy).
A perfect solution will be one where each boy has one, and only one, partner. Since the number of boys and girls are equal, then the converse is also true.
Gale-Shapley Algorithm
Some, what should be easy to read pseudo-code:
while (totalUnpaired(boys) > 0) {
for (each b in boys) {
// choose the highest girl that b has not proposed to
let g = b.nextHighestGirl()
if (g.engagedTo == null || g.engagedTo == b) {
b.engagedTo = g
g.engagedTo = b
} else {
let b' = g.engagedTo
if (g.rank(b') > g.rank(b)) {
// do not ask this girl again
b.forgetGirl(g)
} else {
b.engagedTo = g
g.engagedTo = b
b'.engagedTo = null
}
}
}
}
Gale-Shapley terminates in at most n^2 iterations and it guarantees at least one solution that is both stable and perfect. Furthermore, the algorithm is optimal for the boys. Each boy is paired with his highest possible partner, given a stable pairing. Each girl is paired with her lowest possible partner. This does not mean that each boy is paired with the girl ranked first on his list, but that he will be paired the highest realistic girl, where realistic means in a stable solution. And vice versa for the girls.
Basic Concepts of Probability
Statistics is the collection and analysis of data. Data is a collection of outcomes from a series of random experiments. A random experiment is one with uncertain outcomes. Sample space is the set of all possible outcomes of a random experiment. A random variable is an unknown that can be equal to any value in the sample space. The distribution of a random variable describes the frequencies of that variable. The distribution of a random variable can be displayed graphically via a histogram. The mode is the value of a random variable that has the largest probability. Probability is the relative frequency (f/n) of an outcome when a random experiment is repeated a relatively large number of times. Probability models are used to describe how a random variable is distributed. An event is any subset of a sample space, and is a specific set of outcomes.
Properties of Probability
The probability of any event E must be on the interval [0, 1], so that 0 ≤ P(E) ≤ 1. If S is the sample space, then P(S) = 1. The probability of the union of mutually exclusive events equals the sum of the probability of each event, so that P(A ∪ B ∪ …) = P(A) + P(B) + …
The complementary rule says that P(A`) = 1 - P(A).
The probability of the null event is equal to 0, or P(∅) = 0.
If A ⊂ B, then P(A) ≤ P(B).
The addition rule says that P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
Events are disjoint if they have no common outcomes.
P(A ∪ B ∪ C) = P(A) + P(B) + P(C) - P(A ∩ B) - P(A ∩ C) - P(B ∩ C) + P(A ∩ B ∩ C)
IPv4
The first topic we covered is, IPv4. But first we briefly reviewed the OSI Reference Model, which consists of 7 abstraction layers: Application, Presentation, Session, Transport, Network (including IPv4), Data link, and Physical.
IPv4 uses 4-byte addresses, most commonly seen in the dot-delimited, numerical form: [0-255].[0-255].[0-255].[0-255]. Also, address masks are often used to signify a range of addresses, such that 192.168.1.0/24 is the same as writing 192.168.1.[0-255] and 192.0.0.0/8 is the same as writing 192.[0-255].[0-255].[0-255].
We then looked at some of the weaknesses in IPv4.
Address spoofing can be used to send return traffic to the spoofed address, or to bypass filters to send traffic to the destination. Reverse Path Verification and Egress Filtering can be used to stop some spoofing techniques.
Another concern is packet fragmentation, where the size of a packet may cause the payload to be split up. A packet can be intentionally fragmented to get around simple firewalls. Furthermore, it could be fragmented in such a way that the fragments overlap, so that some fragments contain repeating information. Poor stack implementations can be abused by sending a fragmented packet with a negative or very large offset. Also, a fragment could be missing, causing the packet to never be re-assembled, which multiplied on a grand scale could keep a resource busy waiting for the final fragment of many packets at once.
ARP cache poisoning can be used to perpetrate man-in-the-middle kinds of attacks. Encrypting all traffic could be at least a partial solution to this, depending on whether the session key is kept secret.
The spring semester has started and I have decided to try something new. I will be posting my class notes to this site after each class. There are two reasons I am doing this here, as opposed to just keeping private notes.
- I will actually get to use this space on a regular basis.
- I will be motivated to continue the process once I begin.
Look forward to that.