Hi there,
We have been trying to set up URL filtering for our app but have run into a wall with generating the bloom filter.
Firstly, some context about our set up:
OHTTP handlers
Uses pre-warmed lambdas to expose the gateway and the configs endpoints using the javascript libary referenced here - https://developers.cloudflare.com/privacy-gateway/get-started/#resources
Status = untested
We have not yet got access to Apples relay servers
PIR service
We run the PIR service through AWS ECS behind an ALB
The container clones the following repo https://github.com/apple/swift-homomorphic-encryption, outside of config changes, we do not have any custom functionality
Status = working
From the logs, everything seems to be working here because it is responding to queries when they are sent, and never blocking anything it shouldn’t
Bloom filter generation
We generate a bloom filter from the following url list:
https://example.com
http://example.com
example.com
Then we put the result into the url filtering example application from here - https://developer.apple.com/documentation/networkextension/filtering-traffic-by-url
The info generated from the above URLs is:
{
"bits": 44,
"hashes": 11,
"seed": 2538058380,
"content": "m+yLyZ4O"
}
Status = broken
We think this is broken because we are getting requests to our PIR server for every single website we visit
We would have expected to only receive requests to the PIR server when going to example.com because it’s in our block list
It’s possible that behind the scenes Apple runs sporadically makes requests regardless of the bloom filter result, but that isn’t what we’d expect
We are generating our bloom filter in the following way:
We double hash the URL using fnv1a for the first, and murmurhash3 for the second
hashTwice(value: any, seed?: any): any {
return {
first: Number(fnv1a(value, { size: 32 })),
second: murmurhash3(value, seed),
};
}
We calculate the index positions from the following function/formula , as seen in https://github.com/ameshkov/swift-bloom/blob/master/Sources/BloomFilter/BloomFilter.swift#L96
doubleHashing(n: number, hashA: number, hashB: number, size: number): number {
return Math.abs((hashA + n * hashB) % size);
}
Questions:
What hashing algorithms are used and can you link an implementation that you know is compatible with Apple’s?
How are the index positions calculated from the iteration number, the size, and the hash results?
There was mention of a tool for generating a bloom filter that could be used for Apple’s URL filtering implementation, when can we expect the release of this tool?
Selecting any option will automatically load the page