Persistent storage tailored for containers
Quentin “mefyl” [email protected]
CTO @ Infinit
Version 1.2-26-gbcb3c69
Plan
Containers and persistent storage
Infinit storage platform
Dive-in
Demo
Q&A
Containers and persistent storage
Containers are fast, scalable and flexible.
• Fast and easy to start and stop.
• Fast and easy to scale.
• Unified from development to production.
• Yet customizable for every situation.
However, containers tend to be stateless, which can be quite limiting. We need persistent storage for containers.
• It should be created and started as easily as a container.
• It should be able to scale with your container pool.
• It should work the same way for development, tests, production, …
• It should adapt to all situations.
Infinit storage platform
Infinit is a storage platform designed with containers in mind, providing several APIs: POSIX filesystem, object, and block. It aggregates the local storage of all nodes into a single virtual pool.
The Infinit platform is truly distributed. There are no leaders or followers; all nodes are equal.
• Works the same with 1 or 10k nodes.
• No node is a point of failure or a bottleneck.
• Nodes can come and go: scaling in and out is easy, for both capacity and throughput.
• Uniform API from development to production.
• Customizable for every setup.
Thus, Infinit:
• Can be created and run as seamlessly as a container.
• Can scale with your container pool.
• Is the same in all situations: development, unit tests, production, …
• Can be configured for each situation: encryption, redundancy, compression, …
How to achieve this
Infinit fundamental principles:
• Federate all nodes in an overlay network for lookup and routing.
• Store data as blocks in a distributed hash table (key-value store) with per-block consensus.
• Use cryptographic access control to dispense with any leader.
• Use symmetrical operations to ensure resilience and flexibility.
Dive-in: DHT blocks
Because we are symmetric and use cryptographic access control, all blocks must be self-certified and encrypted.
• Mutable blocks
  ◦ Subject to conflicts.
  ◦ Subject to invalidation.
  ◦ Hard to certify and encrypt.
• Immutable blocks
  ◦ No conflicts.
  ◦ No invalidation: cacheable forever.
  ◦ Easy to certify since they are content-addressable: address = hash(contents).
Immutable blocks can be fetched from any source and kept in a permanent on-disk LRU cache.
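The rule address = hash(contents) can be illustrated with a minimal sketch. It assumes SHA-256 as the hash and an in-memory dictionary as the store; the class and method names are hypothetical and are not Infinit's API.

```python
import hashlib


class ImmutableBlockStore:
    """Toy in-memory store where a block's address is the hash of its contents."""

    def __init__(self):
        self._blocks = {}  # address (hex digest) -> contents

    def put(self, contents: bytes) -> str:
        # The address is derived from the contents, so it never needs updating.
        address = hashlib.sha256(contents).hexdigest()
        self._blocks[address] = contents
        return address

    def get(self, address: str) -> bytes:
        contents = self._blocks[address]
        # Self-certification: anyone can re-hash the block and compare it with
        # the address, so the block can come from an untrusted source or cache.
        if hashlib.sha256(contents).hexdigest() != address:
            raise ValueError("block does not match its address")
        return contents
```

Since a reader can always re-check the hash, a block served by any peer or taken from a local cache is as trustworthy as one read from its original node.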
Dive-in: filesystem layer
A file is mostly a mutable block with metadata and a FAT of immutable blocks.
File contents are cacheable at will, and writes are cheap and atomic.
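A rough sketch of this layout, under stated assumptions: a fixed chunk size, SHA-256 addresses, and plain dictionaries standing in for the DHT. All names (`CHUNK_SIZE`, `write_file`, `read_file`) are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and deliberately tiny, for illustration only

immutable_blocks = {}  # address -> chunk contents
mutable_blocks = {}    # file id -> {"metadata": ..., "fat": [addresses]}


def put_immutable(contents: bytes) -> str:
    address = hashlib.sha256(contents).hexdigest()
    immutable_blocks[address] = contents
    return address


def write_file(file_id: str, data: bytes) -> None:
    # Store the data chunks first; unchanged chunks hash to the same address
    # and are simply reused.
    fat = [put_immutable(data[i:i + CHUNK_SIZE])
           for i in range(0, len(data), CHUNK_SIZE)]
    # Only the small mutable block is replaced, in a single step, so readers
    # see either the old version of the file or the new one.
    mutable_blocks[file_id] = {"metadata": {"size": len(data)}, "fat": fat}


def read_file(file_id: str) -> bytes:
    block = mutable_blocks[file_id]
    return b"".join(immutable_blocks[address] for address in block["fat"])
```

In this sketch a rewrite only touches the chunks that changed plus the mutable block, which is one way to read "cheap and atomic writes".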
Dive-in: filesystem layer
The POSIX API is inherently sequential. We are highly parallel.
Directory prefetching and file look-ahead enable batching and pipelining.
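One way to picture look-ahead: while the caller consumes block N sequentially, the next few blocks are already being fetched in parallel. The sketch below assumes a hypothetical `fetch_block(address)` callable and a thread pool; it is not how Infinit implements it.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel marking the end of the address list


def read_with_lookahead(addresses, fetch_block, window=4):
    """Yield blocks in order while keeping up to `window` fetches in flight."""
    with ThreadPoolExecutor(max_workers=window) as pool:
        remaining = iter(addresses)
        pending = deque()
        # Prime the pipeline with the first `window` fetches.
        for address in remaining:
            pending.append(pool.submit(fetch_block, address))
            if len(pending) >= window:
                break
        while pending:
            block = pending.popleft().result()
            # Refill the window before handing the block to the caller.
            address = next(remaining, _DONE)
            if address is not _DONE:
                pending.append(pool.submit(fetch_block, address))
            yield block
```

The caller still sees a sequential stream, as the POSIX API expects, while the fetches overlap underneath.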
Dive-in: consensus
Each block is managed by a specific quorum of nodes with a variable composition, running Multi-Paxos.
No single point of failure or bottleneck, and strong read-after-write consistency.
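The consistency claim rests on a general property of majority quorums, which Paxos-style protocols rely on: any two majorities of the same group share at least one node, so a read quorum always meets the latest write quorum. The sketch below only checks that property by brute force; it is not an implementation of Multi-Paxos, and the function names are hypothetical.

```python
from itertools import combinations


def majority(n: int) -> int:
    """Smallest number of nodes that forms a majority of n."""
    return n // 2 + 1


def quorums_always_intersect(nodes) -> bool:
    """Check that every pair of majority quorums shares at least one node."""
    size = majority(len(nodes))
    quorums = [set(q) for q in combinations(nodes, size)]
    return all(a & b for a in quorums for b in quorums)
```

With 5 nodes per block, for example, a majority is 3, so the block stays available with up to 2 of its nodes down.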
Dive-in: overlay
The overlay layer is one major customization point of the platform.
Algorithm choice:
• Several thousand machines: Kelips, Kademlia, Chord.
• A few hundred machines and dozens of terabytes: global knowledge.
Data placement: rack-aware, zone-aware, reliability-aware, ensure local copies, ...
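As an illustration of how lookup and placement can combine, here is a sketch for the global-knowledge case: every node knows the full node list, ranks nodes by Kademlia-style XOR distance to the block address, and keeps at most one replica per rack. The rack-aware policy shown, the SHA-256 node identifiers, and all names are assumptions made for the example.

```python
import hashlib


def node_id(name: str) -> int:
    """Derive a stable numeric identifier from a node name (assumed scheme)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")


def place_block(block_address: int, nodes: dict, replicas: int = 3) -> list:
    """Pick up to `replicas` nodes for a block, at most one per rack.

    `nodes` maps a node name to its rack label. Nodes are ranked by XOR
    distance between their identifier and the block address, nearest first.
    """
    ranked = sorted(nodes, key=lambda name: node_id(name) ^ block_address)
    chosen, used_racks = [], set()
    for name in ranked:
        if nodes[name] not in used_racks:
            chosen.append(name)
            used_racks.add(nodes[name])
        if len(chosen) == replicas:
            break
    return chosen
```

Because the ranking is deterministic, any node holding the same node list computes the same placement without asking a leader.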
Demo!
Let's persist that storage!