Commit Graph

3 Commits

Author SHA1 Message Date
Donovan Baarda
fcc4cbc3f3 Change fastcdc to a better and simpler algorithm. (#79)
This CL changes the chunking algorithm from "normalized chunking" to
simple "regression chunking", and changes the has criteria from
'hash&mask' to 'hash<=threshold'. These are all ideas taken from
testing and analysis done at
  https://github.com/dbaarda/rollsum-chunking/blob/master/RESULTS.rst
Regression chunking was introduced in
  https://www.usenix.org/system/files/conference/atc12/atc12-final293.pdf

The algorithm uses an arbitrary number of regressions using power-of-2
regression target lengths. This means we can use a simple bitmask for
the regression hash criteria.

Regression chunking yields high deduplication rates even for lower max
chunk sizes, so that the cdc_stream max chunk can be reduced to 512K
from 1024K. This fixes potential latency spikes from large chunks.
2023-02-08 15:06:41 +01:00
Lutz Justen
c21503d21b [cdc_stream] Fix issues found in tests (#40)
* [cdc_stream] Fix issues found in tests

Fixes a couple of issues found by integration testing:
- Unicode command line args in cdc_stream show up as question marks.
- Log is still named assets_stream_manager instead of cdc_stream.
- An error message contains stadia_assets_stream_manager_v3.exe.
- mount_dir was not the last arg as required by FUSE
- Promoted cache cleanup logs to INFO level since they're important
  for the proper workings of the system.
- Asset streaming cache dir is still %APPDATA%\GGP\asset_streaming.

* Address comments
2022-12-07 11:25:43 +01:00
Lutz Justen
01a60e2490 Rename asset_stream_manager to cdc_stream (#27)
Fixes #13
2022-12-01 16:14:56 +01:00