The document describes Freenet, a distributed anonymous information storage and retrieval system. Freenet operates as a decentralized peer-to-peer network where nodes can store and retrieve data. It aims to protect anonymity of users and resist censorship of information. Data is stored on the network through a process where requests are routed across nodes based on the data key. This allows for popular data to be replicated across nodes.
Freenet
1. Anonymous Information
Storage and Retrieval System
Ashraf Uddin
Sujit Singh
South Asian University
(Master of Computer Application)
http://ashrafsau.blogspot.in/
2. Introduction
Networked computer systems are growing rapidly.
Current systems offer little user privacy.
Every new data item is stored in only one or a
few places.
3. Freenet
A distributed information storage and retrieval
system.
Designed to address privacy concerns.
No central point of failure.
Operates as a distributed file system across
many individual computers.
Transparent moving, deletion, and replication of data.
4. Freenet Design Goals
Anonymity for producer and consumer of
information.
Deniability for storers of information.
Resistance to attempts by third parties to deny
access to information.
Efficient dynamic storage and routing of
information.
Decentralization of all network functions.
5. Roadmap
Architecture
Keys and Searching
Retrieving Data
Storing Data
Managing Data
Adding Nodes
Protocol Details
Performance Analysis
Network Convergence
Scalability
Fault Tolerance
Small World Model
Security
6. Architecture ( 1 / 2)
Freenet is implemented as an adaptive peer-to-peer
network of nodes.
Nodes query one another to store and retrieve
information.
Files are named by location-independent keys.
Each node maintains:
Shared datastore
Routing table of entries (node address, data keys
that node is thought to hold).
7. Architecture ( 2 / 2)
Requests for keys are passed along from node
to node through a chain of proxy requests.
Routes depend on the key.
Each request is assigned a hops-to-live value.
Each request is assigned a pseudo-unique
random identifier.
Joining the network requires discovering the
address of one or more existing nodes.
8. Keys And Searching
Freenet data files are identified by binary
file keys.
Binary file keys are obtained by applying the
160-bit SHA-1 hash function.
Three Types of keys
1. Keyword-Signed Key (KSK)
2. Signed-Subspace Key ( SSK )
3. Content Hash Key ( CHK )
9. Keyword-Signed Key (KSK) (1/2)
A KSK is derived from a short descriptive string for the file.
The descriptive string is chosen when the file is stored.
A public/private key pair is generated deterministically
from the descriptive string.
The public half is hashed to yield the file key.
The private half is used to sign the file, which gives a minimal
check that a retrieved file matches its key.
10. Keyword-Signed Key (KSK) (2/2)
The user publishes only the descriptive
string.
Problem: all KSKs share one global namespace, so users can
collide on the same descriptive string, and junk files can
be inserted under popular descriptive strings.
The file is encrypted using the descriptive
string as a key.
12. Signed-Subspace Key ( SSK ) (1/2)
Addresses the global namespace problem of KSKs.
A user creates a namespace by randomly
generating a public/private key pair.
Files are inserted into the namespace using the private half.
File key generation process
1. Public namespace key and descriptive string
hashed independently
2. XOR’ed together
3. Hash the XOR result.
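The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not Freenet's actual code: the function name `ssk_file_key`, the random bytes standing in for a real public key, and the example descriptive string are all assumptions made for the sketch.

```python
import hashlib
import os


def sha1(data: bytes) -> bytes:
    """Return the 20-byte (160-bit) SHA-1 digest of data."""
    return hashlib.sha1(data).digest()


def ssk_file_key(public_key: bytes, description: str) -> bytes:
    """Derive a signed-subspace file key from a namespace key and a string."""
    # Step 1: hash the public namespace key and the descriptive string independently.
    h_namespace = sha1(public_key)
    h_description = sha1(description.encode("utf-8"))
    # Step 2: XOR the two digests together.
    xored = bytes(a ^ b for a, b in zip(h_namespace, h_description))
    # Step 3: hash the XOR result to obtain the file key.
    return sha1(xored)


# Random bytes stand in for a real public key (assumption for this sketch).
namespace_public_key = os.urandom(32)
print(ssk_file_key(namespace_public_key, "reports/2001/summary").hex())
```

Because the namespace key enters the derivation, two users who pick the same descriptive string in different subspaces obtain different file keys.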
13. Signed-Subspace Key ( SSK ) (2/2)
Private half used to sign the file.
User publishes the descriptive string along
with the subspace’s public key.
Adding or updating files in the subspace requires the
private key.
The file is encrypted using the descriptive
string as a key.
15. Content Hash Key ( CHK )
A content hash key is acquired by directly
hashing the contents of the corresponding file.
This gives every file a pseudo-unique file key.
Files are encrypted using a randomly generated
encryption key.
User publishes the content hash key along with
the decryption key.
The decryption key is not stored together with
the file.
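As a small illustration of the first step, a content hash key can be computed with Python's standard `hashlib`. The function name `content_hash_key` is hypothetical, and the sketch deliberately leaves out the encryption with a random key described above.

```python
import hashlib


def content_hash_key(content: bytes) -> str:
    """Return the SHA-1 digest of the file contents as a hex string."""
    return hashlib.sha1(content).hexdigest()


print(content_hash_key(b"abc"))  # a9993e364706816aba3e25717850c26c9cd0d89d
```

Identical contents always map to the same key, and changing a single byte yields an unrelated key, which is what makes the key pseudo-unique.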
18. Retrieving Data (1/3)
Downstream node : Node to which a request will
be passed.
Upstream node : Node to which a reply/data
returns.
Process of retrieving data
The user initiates a request of the form (binary file key,
hops-to-live).
The request is sent to the user’s own node.
If the data is found there, it is returned with a note
identifying the node as the source.
19. Retrieving Data (2/3)
Continued
If not found, the request is forwarded to the next node.
If the next node has the data, it is returned along the
path already established, and every intervening node caches it.
New route entries are created.
Failures
If a downstream node is down, the current node tries its second
choice.
If the hops-to-live limit is exceeded, a failure message is returned
to the original requestor.
20. Retrieving Data (3/3)
A request operates as a steepest-ascent
hill-climbing search with backtracking.
21. Example request sequence (nodes A to F; the file is held by E)
1. A initiates a request and asks B if it has the file.
2. B doesn’t, so it asks its best-bet peer F.
3. F doesn’t either and has no more nodes to ask, so it returns a “request failed” message.
4. B tries its second choice, D.
5. D doesn’t have it, so it forwards the request to C.
6. Nor does C, so it forwards the request to B.
7. B detects that it has seen this request before, so it returns a “request failed” message.
8. C forwards “request failed” back to D.
9. D now tries its second choice, E.
10. Success! E returns the file to D, which propagates it back to A.
11. The file is sent to B.
12. B sends the file back to A.
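The example sequence can be replayed with a short depth-first search. This is a simplified sketch under stated assumptions: the `PEERS` table (each node's peers, already ordered best bet first for this key), the `request` function, and the key name are hypothetical, and hops-to-live is simply decremented per forwarding step rather than modeled as in the real protocol.

```python
from typing import Optional

# Hypothetical topology mirroring the example: each node lists its peers
# in order of preference for the requested key (best bet first).
PEERS = {
    "A": ["B"],
    "B": ["F", "D"],
    "C": ["B"],
    "D": ["C", "E"],
    "E": ["D"],
    "F": ["B"],
}


def request(node, key, htl, stores, upstream=None, seen=None, trace=None) -> Optional[bytes]:
    """Search for key starting at node; return the data or None on failure."""
    seen = set() if seen is None else seen
    trace = [] if trace is None else trace
    trace.append(node)               # record every node the request reaches
    if node in seen:                 # loop: this node has seen the request before
        return None
    seen.add(node)
    if key in stores[node]:          # found locally
        return stores[node][key]
    if htl <= 0:                     # hops-to-live exhausted
        return None
    for peer in PEERS[node]:         # try first choice, then backtrack to the next
        if peer == upstream:
            continue
        data = request(peer, key, htl - 1, stores, node, seen, trace)
        if data is not None:
            stores[node][key] = data  # cache on the way back to the requestor
            return data
    return None                      # every choice failed: report "request failed"


stores = {name: {} for name in PEERS}
stores["E"]["some-key"] = b"file contents"
trace = []
data = request("A", "some-key", htl=10, stores=stores, trace=trace)
print(trace)   # ['A', 'B', 'F', 'D', 'C', 'B', 'E']
print(sorted(n for n, s in stores.items() if "some-key" in s))  # ['A', 'B', 'D', 'E']
```

The trace shows the dead end at F, the loop detected when C forwards back to B, and the final success at E. After the run, every node on the successful path holds a cached copy.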
24. Effects of the data retrieve process
After a number of queries, nodes come to specialize in locating a few
sets of similar keys, where “similar” means lexicographically close.
Nodes will specialize in storing clusters of files
with similar keys.
Popular data will be transparently replicated
near the requesting nodes.
As nodes process requests, new route entries
are created, which increases connectivity.
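How a node might choose the next hop from its routing table can be sketched as below. The distance measure (absolute difference of keys read as integers) and the names `closest_route` and `routing_table` are assumptions of this sketch, chosen only to illustrate "closest key wins".

```python
def closest_route(routing_table, key):
    """Return the node address stored under the key closest to the requested key."""
    # routing_table maps a known data key (as an int) to the address of the
    # node believed to hold it; closeness here is plain numeric distance.
    best_key = min(routing_table, key=lambda known: abs(known - key))
    return routing_table[best_key]


routing_table = {0x10: "node-1", 0x80: "node-2", 0xF0: "node-3"}
print(closest_route(routing_table, 0x7A))  # node-2
```

Because successful replies add new entries under the requested key, a node that answers well for one region of the key space attracts more requests for that region, which is the specialization effect described above.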
25. Lexicographic closeness = Data closeness?
Lexicographic closeness of keys does not imply
closeness of descriptive strings.
E.g. the hash keys AH5JK2, AH5JK3 and AH5JK5
will most probably refer to completely
unrelated files.
This scattering is intended: it spreads related files
across the network and so avoids central points of failure.
26. Storing Data ( 1/ 2)
Storing data is similar to retrieving data.
The user calculates the binary file key and specifies a
hops-to-live value.
Hops-to-live specifies the number of nodes
on which the data will be stored.
Nodes accept insert proposals.
If the key is found, the node returns the pre-
existing file to the requestor.
27. Storing Data ( 2/ 2)
If the key is not found, the node forwards the
proposal to the next node, chosen by lexicographic
key distance.
When the hops-to-live limit is reached without a collision,
an “all clear” message is sent back to the original requestor.
The requestor then sends the data to be stored.
The data is cached on every node along the
established path, and route entries are created.
Failures are handled as in the retrieve
process.
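The two-phase insert (propose the key, then send the data) can be sketched as follows. This is a simplification under stated assumptions: the route is given as a precomputed list `path`, whereas a real node would choose each hop from its routing table, and the names `insert` and `stores` are hypothetical.

```python
def insert(path, key, data, htl, stores):
    """Propose key along path; return the existing file on collision, else None."""
    visited = []
    for node in path[:htl]:                # at most hops-to-live nodes are consulted
        if key in stores[node]:
            # Collision: the pre-existing file travels back to the requestor
            # and is cached along the way, exactly as for a retrieval.
            existing = stores[node][key]
            for earlier in visited:
                stores[earlier][key] = existing
            return existing
        visited.append(node)
    # "All clear": the requestor sends the data, and every node on the path stores it.
    for node in visited:
        stores[node][key] = data
    return None


stores = {"X": {}, "Y": {}, "Z": {}}
print(insert(["X", "Y", "Z"], "k", b"new file", htl=2, stores=stores))   # None
print(insert(["Z", "Y"], "k", b"junk", htl=5, stores=stores))            # b'new file'
print(stores["Z"]["k"])                                                  # b'new file'
```

The second call shows why inserting junk under an existing key backfires: the attempt fails and leaves one more copy of the original file in the network.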
29. Effects of the storing Mechanism
1. New files are cached on nodes that have
already stored files with similar keys.
2. Newly added nodes can use the store
mechanism to announce their existence.
3. Attackers that may try to insert junk files
under existing keys will simply spread
the pre-existing files.
30. Data Management (1/2)
Finite storage space.
Finite route table space.
Storage is managed by an LRU (least recently used) policy.
When a new file arrives and no space is
available, the least recently used entries are deleted.
The datastore and the routing table can become inconsistent,
since each is pruned separately.
Routing table entries are deleted in the same
fashion.
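An LRU-managed datastore can be sketched with `collections.OrderedDict` from the standard library. The class name `LruDatastore` and its capacity in items (rather than bytes) are assumptions made to keep the sketch short.

```python
from collections import OrderedDict


class LruDatastore:
    """Fixed-capacity store that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()   # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a hit makes the entry most recently used
        return self._items[key]

    def put(self, key, data):
        self._items[key] = data
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry

    def keys(self):
        return list(self._items)


store = LruDatastore(capacity=2)
store.put("a", b"1")
store.put("b", b"2")
store.get("a")          # "a" is now more recently used than "b"
store.put("c", b"3")    # no space left, so "b" is evicted
print(store.keys())     # ['a', 'c']
```

A file that is never requested drifts to the front of the queue and is eventually dropped.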
31. Data Management (2/2)
No guarantee for file lifetime.
Nodes can decide to completely drop a
data file.
Stored files are encrypted, for political and legal
reasons.
32. Adding Nodes (1/2)
A new node can join the network by
discovering the address of one or more
existing nodes.
New nodes must “announce” their
existence.
Existing nodes need to agree on the key under which
they will list the new node in their routing tables.
33. Adding Nodes (2/2)
Process of joining a Freenet network
The candidate node chooses a random seed.
It sends a message to an existing node containing its
address and the hash of the seed.
The node that receives this message generates its own
seed, XORs it with the hash value in the message, and
sends the result to a randomly chosen node.
When hops-to-live reaches 0, all nodes reveal their
seeds.
All seeds are XORed to produce the new node’s key.
Each node adds a new entry for the new node to its
routing table under that key.
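The joining steps amount to a chained commitment: each participant commits to its seed before anyone reveals, so no single node can steer the final key. The sketch below is illustrative only. The function names are hypothetical, seeds are fixed at 20 bytes to match the SHA-1 digest length, and hashing the XORed value again at every hop is an assumption of this sketch rather than something stated on the slide.

```python
import hashlib
import os
from functools import reduce


def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def announce(seeds):
    """Return the chain of commitments passed from node to node."""
    commitments = [sha1(seeds[0])]            # the new node sends hash(seed)
    for seed in seeds[1:]:
        # Each later node mixes its own seed into the value it received
        # (re-hashing here is an assumption of this sketch).
        commitments.append(sha1(xor(seed, commitments[-1])))
    return commitments


def verify(seeds, commitments):
    """Check the revealed seeds against the commitments made earlier."""
    return announce(seeds) == commitments


def node_key(seeds):
    """XOR all revealed seeds to obtain the key assigned to the new node."""
    return reduce(xor, seeds)


seeds = [os.urandom(20) for _ in range(4)]    # new node plus three existing nodes
commitments = announce(seeds)
print(verify(seeds, commitments))             # True
print(node_key(seeds).hex())
```

If any participant reveals a seed different from the one it committed to, `verify` fails, so the others can reject the announcement.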
34. Freenet Protocol
Based on messages.
Every message carries the header
<Transaction ID, Hops-To-Live, Depth counter>
Hops-to-live is decremented and the depth counter
incremented at every hop.
The depth counter is used by the replying node to set
hops-to-live high enough for the reply to reach the requestor.
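The interplay of the three header fields can be sketched with a small dataclass. The class and function names are hypothetical, and the 64-bit random transaction ID is an assumption of this sketch.

```python
import secrets
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    transaction_id: int
    hops_to_live: int
    depth: int = 0

    def forwarded(self):
        """Return the header as it looks after one more hop."""
        return replace(self, hops_to_live=self.hops_to_live - 1, depth=self.depth + 1)


def new_request(hops_to_live):
    # A random ID lets nodes recognize a request they have already seen.
    return Message(secrets.randbits(64), hops_to_live)


def reply_to(msg):
    """Build a reply whose hops-to-live covers the distance already travelled."""
    return Message(msg.transaction_id, hops_to_live=msg.depth, depth=0)


hop2 = new_request(hops_to_live=5).forwarded().forwarded()
print(hop2.hops_to_live, hop2.depth)   # 3 2
print(reply_to(hop2).hops_to_live)     # 2
```

Keeping the transaction ID unchanged in the reply is what lets each node on the path match the reply to the request it forwarded.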
35. Request Data
The requestor sends a Request.Data message including
the search key.
If the search succeeds, the source of the data
responds to the upstream node with a Send.Data
message.
If the search fails or hops-to-live is
exhausted, a Reply.NotFound message is sent.
If the request reaches a dead end or a loop is detected while
HTL is not 0, a Request.Continue message carrying the
remaining HTL is sent back to the upstream node.
During the search, the remote node may periodically send back
Reply.Restart messages so that waiting nodes extend their timers.
36. Store Data
The requesting node sends a Request.Insert message
which contains the proposed key.
The insert message is propagated from node to node
based on route entries.
In case of a key collision, a Send.Data message or a
Reply.NotFound message is sent back.
If no more nodes can be reached but HTL remains, a
Request.Continue message is sent.
If HTL reaches 0 without a collision having been encountered, a
Reply.Insert message is propagated to the upstream
node.
37. Performance Analysis
Network Convergence
Scalability
Fault Tolerance
Small World Model
38. Network Convergence (1/2)
1000 nodes, each with a datastore of 50 items and a
routing table of 150 entries.
Initially, each node has routing entries only for its two
closest neighbors.
Random keys were inserted at random nodes.
Every 100 time steps, 300 random requests for
previously inserted files were performed with
HTL=500.
Request path length = number of hops taken
before finding the data.
40. Scalability (1/2)
The network started with 20 nodes.
Inserts and requests were performed
randomly, as in the convergence test.
Every 5 time steps a new node was
created and added to the network.
Its announcement message was sent to
a randomly chosen node.
42. Fault tolerance (1/2)
Network of 1000 nodes.
Randomly chosen nodes were progressively removed
to simulate node failures.
In this simulation Freenet proved robust against node
failures.
The median path length remains below 20 even
when up to 30% of the nodes have failed.
44. Small World Networks Model
The scalability and fault-tolerance characteristics of
Freenet can be explained in terms of a small-world
network model.
The majority of the nodes have only a few local connections
to other nodes.
A few nodes have many wide-ranging connections.
Nodes are well connected, with short paths among them.
Small-world networks are fault tolerant.
45. Is Freenet a small world?
For Freenet to be a small world, the number of links per node
must follow a scale-free power-law distribution.
46. Security issues
Primary goal is protecting the anonymity of
both requestors and inserters of data.
Protect the identity of the node that holds
some specific data.
If a malicious user intends to remove a
data file, he is hindered by the anonymity
of the node that holds the file.
47. Freenet – Prerouting
Freenet messages are encrypted with a
succession of public keys, which determine the
route that the message will follow.
Nodes along the route cannot determine either
the originator of the message or its
contents (since they are encrypted).
At the end of the prerouting phase, the
message is inserted into Freenet as if
the endpoint of the preroute were
the originator of the message.
48. Data sources Protection
When a node replies to its upstream node
that it is the source of some file, it can
intentionally hide its own address.
49. Other security concerns
Modification of requested files.
A node steering all traffic to itself by
pretending it holds all the data files.
DoS attacks that attempt to exhaust the storage space.
Possible countermeasures:
Make inserters “pay” by performing a long computation.
Divide the datastore into a “new files” section and an
“established files” section.
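One way to make an inserter "pay" with computation is a proof of work in the style of Hashcash. The sketch below is an assumption about how such a payment could look, not a description of Freenet's implementation: the function names, the 8-byte nonce, and the difficulty of 12 bits are all hypothetical.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def stamp(key: bytes, nonce: int) -> bytes:
    return hashlib.sha1(key + nonce.to_bytes(8, "big")).digest()


def pay(key: bytes, difficulty: int) -> int:
    """Search for a nonce whose stamp has at least `difficulty` leading zero bits."""
    for nonce in count():
        if leading_zero_bits(stamp(key, nonce)) >= difficulty:
            return nonce


def check(key: bytes, nonce: int, difficulty: int) -> bool:
    """Verify a payment with a single hash."""
    return leading_zero_bits(stamp(key, nonce)) >= difficulty


nonce = pay(b"proposed-file-key", difficulty=12)   # about 4096 hashes on average
print(check(b"proposed-file-key", nonce, 12))      # True
```

The asymmetry is the point: the inserter performs thousands of hashes per insert, while each storing node verifies the payment with one.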
50. Gnutella
Many similarities exist between Freenet
and Gnutella
Every user is visible to every other user
while online
Users are split up into groups
Gnutella employs a broadcast search
for files, whose message count grows exponentially
with search depth
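The exponential growth can be made concrete with a rough upper bound. The sketch assumes that every node forwards the query to `fanout` peers that have not yet seen it, for `ttl` hops; both parameter values are illustrative, not measurements of Gnutella.

```python
def broadcast_messages(fanout, ttl):
    """Upper bound on messages sent when each hop fans out to `fanout` new peers."""
    return sum(fanout ** hop for hop in range(1, ttl + 1))


print(broadcast_messages(4, 7))  # 21844
```

A Freenet request, by contrast, follows a single path at a time, so its message count is bounded by the hops-to-live value.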
51. Napster
Napster has a centralized server, which
does not store any data
It coordinates searches of users
Security risk:
if the centralized server is shut down, there is no way
of distributing files
52. Conclusions
Effective means of anonymous information
storage and retrieval.
Highly scalable.