Improves ServerArch so that it can detect the remote architecture by
running uname and checking %PROCESSOR_ARCHITECTURE%. So far, only
x64 Linux and x64 Windows are supported, but it will be easy to add
support for others, e.g. aarch64, in the future.
Before the detection is run, the remote architecture is guessed first
based on the destination. For instance, if the destination directory
starts with "C:\", it pretty much means Windows. If cdc_rsync_server
exists and runs fine, there's no need for detection.
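The guess based on the destination could look roughly like the sketch below. The enum values, the function name GuessArchFromDestination and the exact drive-letter rule are assumptions for illustration; the heuristic in ServerArch may differ.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for the ArchType values mentioned above.
enum class ArchType { kWindows_x86_64, kLinux_x86_64 };

// Guesses the remote architecture from the destination path.
// Assumption: a path that starts with a drive letter followed by ":\" or
// ":/" (e.g. "C:\assets") indicates Windows; everything else is treated as
// Linux. The real detection (uname, %PROCESSOR_ARCHITECTURE%) would only run
// if this guess turns out to be wrong.
ArchType GuessArchFromDestination(const std::string& destination) {
  const bool has_drive_prefix =
      destination.size() >= 3 &&
      std::isalpha(static_cast<unsigned char>(destination[0])) &&
      destination[1] == ':' &&
      (destination[2] == '\\' || destination[2] == '/');
  return has_drive_prefix ? ArchType::kWindows_x86_64
                          : ArchType::kLinux_x86_64;
}
```

For example, "C:\assets" would be guessed as Windows, while "~/assets" would be guessed as Linux.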
Since PortManager also depends on the remote architecture, it has to
be adjusted as well. So far, PortManager assumed that "local" means
Windows and "remote" means Linux. This is no longer the case for
syncing to Windows devices, so this CL adds the necessary abstractions
to PortManager.
Also refactors ArchType into a separate class in common, since it is
used now from several places. It is also expanded to handle future
changes that add support for different processor architectures, e.g.
aarch64.
Also clarifies some unclear aspects in the readme, and adds a fix that
allows create_release.yml to be used for pull requests for testing.
Fixes #67
Fixes #55
On Windows, fclose() seems to be very expensive for large files, where
closing a 1 GB file takes up to 5 seconds. This CL calls fclose() in
background threads. This tremendously improves local syncs, e.g.
copying a 4.5 GB, 300 files data set takes only 7 seconds instead of
30 seconds.
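A minimal sketch of the idea, assuming a hypothetical helper class AsyncFileCloser (not the name used in the code base): each fclose() is handed to a background task, and the caller waits for all pending closes once at the end.

```cpp
#include <cstdio>
#include <future>
#include <vector>

// Hypothetical helper that closes files on background threads so that the
// copy loop does not block on a slow fclose().
class AsyncFileCloser {
 public:
  // Waits for all pending fclose() calls to finish.
  ~AsyncFileCloser() { Wait(); }

  // Schedules fclose(file) on another thread and returns immediately.
  // The caller must not use |file| afterwards.
  void Close(std::FILE* file) {
    pending_.push_back(
        std::async(std::launch::async, [file] { return std::fclose(file); }));
  }

  // Blocks until all scheduled closes are done. Returns the number of
  // fclose() calls that reported an error.
  int Wait() {
    int num_failed = 0;
    for (std::future<int>& result : pending_) {
      if (result.get() != 0) ++num_failed;
    }
    pending_.clear();
    return num_failed;
  }

 private:
  std::vector<std::future<int>> pending_;
};
```

This sketch starts one task per file. A small pool of worker threads would probably be a better fit when syncing many small files.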
Also increases the buffer size for copying from 16K to 128K (better
throughput for local copies), and adds a timestamp to debug and
verbose console logs (useful when comparing client and server logs).
Build id is an optional unique identifier specified during the cdc_rsync build via the CDC_BUILD_VERSION definition.
If a build id is specified on both the client and server components, it is used to check the version of the server component instead of file size + modified time.
Adds a function to filter ANSI escape sequences from a string.
Executing SSH commands on Windows yields output that is full of ANSI
escape sequences if the "-tt" (forced TTY) argument is used. One
particular escape sequence sets the window title to
"c:\windows\system32\cmd.exe". This string is null terminated and
messes with parsing the actual output later in that string.
The filter function removes those escape sequences.
The output is still a bit messed up, even after removing escape
sequences. Some sequences delete rows and move the cursor. Without
properly interpreting these sequences it doesn't seem possible to
retrieve the proper output.
A future CL will remove the -tt argument on Windows, which removes
the necessity to filter ANSI codes. However, sometimes the target
architecture is not known (yet), so it is still useful to filter
ANSI codes in that case to print useful debug output.
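A filter along these lines could be sketched as follows. The function name RemoveAnsiEscapes is hypothetical, and treating a NUL byte as a terminator of the window title sequence is an assumption based on the description above. The sketch only strips sequences; it does not interpret cursor movement or row deletion.

```cpp
#include <cstddef>
#include <string>

// Removes ANSI escape sequences from |input|. Not a full terminal parser:
// it handles CSI sequences (ESC [ ... final byte) and OSC sequences
// (ESC ] ... terminator), and drops any other ESC together with the byte
// that follows it.
std::string RemoveAnsiEscapes(const std::string& input) {
  constexpr char kEsc = '\x1b';
  constexpr char kBel = '\x07';
  std::string output;
  size_t i = 0;
  while (i < input.size()) {
    if (input[i] != kEsc) {
      output.push_back(input[i++]);
      continue;
    }
    ++i;  // Skip ESC.
    if (i >= input.size()) break;
    if (input[i] == '[') {
      // CSI: skip parameter and intermediate bytes, then the final byte,
      // which is in the range '@'..'~'.
      ++i;
      while (i < input.size()) {
        const unsigned char c = static_cast<unsigned char>(input[i]);
        if (c >= 0x40 && c <= 0x7e) break;
        ++i;
      }
      if (i < input.size()) ++i;  // Skip the final byte.
    } else if (input[i] == ']') {
      // OSC (e.g. "set window title"): ends with BEL or ESC \.
      // Assumption: a NUL byte also terminates the sequence.
      ++i;
      while (i < input.size() && input[i] != kBel && input[i] != '\0' &&
             input[i] != kEsc) {
        ++i;
      }
      if (i < input.size()) {
        if (input[i] != kEsc) {
          ++i;  // Skip BEL or NUL.
        } else if (i + 1 < input.size() && input[i + 1] == '\\') {
          i += 2;  // Skip the string terminator ESC \.
        }
        // Otherwise leave the ESC for the next loop iteration.
      }
    } else {
      ++i;  // Two-byte escape such as ESC c.
    }
  }
  return output;
}
```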
Adds support for local syncs of files and folders on the same Windows
machine, e.g. cdc_rsync C:\source C:\dest. The main changes are:
- Skip the check whether the port is available remotely with PortManager.
- Do not deploy cdc_rsync_server.
- Run cdc_rsync_server directly, not through an SSH tunnel.
The current implementation is not optimal as it starts
cdc_rsync_server as a separate process and communicates to it via a
TCP port.
* Fix #76 fastcdc chunk boundary off-by-one.
This ensures that the last byte included in the gear-hash that identified the
chunk boundary is included in the chunk. This ensures chunks are still matched
when the byte immediately after them is changed.
* Init gear hash to all 1's to prevent zero-length chunks with min_size=0.
Also change the `MaxChunkSize` test to use min_size=0 to test this works.
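The boundary rule can be illustrated with a simplified gear-hash chunker. This is a sketch under stated assumptions, not the fastcdc code from this repository: the gear table, the mask handling and the names MakeGearTable and FindChunkEnds are made up for illustration, and unlike real FastCDC it hashes every byte instead of skipping the first min_size bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative gear table filled by a splitmix64-style generator. A real
// implementation would use its own fixed table.
std::vector<uint64_t> MakeGearTable() {
  std::vector<uint64_t> table(256);
  uint64_t state = 0x9e3779b97f4a7c15ull;
  for (uint64_t& entry : table) {
    state += 0x9e3779b97f4a7c15ull;
    uint64_t z = state;
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ull;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebull;
    entry = z ^ (z >> 31);
  }
  return table;
}

// Returns the end offsets (exclusive) of all chunks in |data|.
std::vector<size_t> FindChunkEnds(const std::vector<uint8_t>& data,
                                  size_t min_size, size_t max_size,
                                  uint64_t mask) {
  static const std::vector<uint64_t> gear = MakeGearTable();
  std::vector<size_t> ends;
  size_t start = 0;
  uint64_t hash = ~uint64_t{0};  // Initialized to all 1's.
  for (size_t i = 0; i < data.size(); ++i) {
    hash = (hash << 1) + gear[data[i]];
    const size_t size = i + 1 - start;  // Chunk size including data[i].
    if ((size >= min_size && (hash & mask) == 0) || size >= max_size) {
      // data[i] was part of the hash that found the boundary, so it is the
      // last byte of this chunk. The next chunk starts at i + 1.
      ends.push_back(i + 1);
      start = i + 1;
      hash = ~uint64_t{0};
    }
  }
  if (start < data.size()) ends.push_back(data.size());
  return ends;
}
```

Since a boundary found at index i depends only on bytes up to and including i, changing the byte at i + 1 does not move it. In this sketch the boundary test runs only after a byte has been hashed, so the all-ones initialization is kept merely to mirror the description.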
Use sftp for deploying remote components instead of scp. sftp has the
advantage that it can also create directories, chmod files etc., so
that we can do everything in one call of sftp instead of mixing scp
and ssh calls.
The downside of sftp is that it can't switch to ~ or %userprofile%
on the remote side, and we have to assume that sftp starts in the
user's home dir. This is the default and works on my machines!
cdc_rsync and cdc_stream check the CDC_SFTP_COMMAND env var now and
accept --sftp-command flags. If they are not set, the corresponding
scp flag and env var is still used, with scp replaced by sftp. This is
most likely correct as sftp and scp usually reside in the same
directory and share largely identical parameters.
Adds a ServerArch class whose job it is to encapsulate differences
between Windows and Linux cdc_rsync_servers. It detects the type
based on a heuristic in the destination path. This is not foolproof
and will probably require further work, like falling back to the other
type if the detected one doesn't work.
Uses the ServerArch class to determine the different commands to start
the server and to deploy the server.
Note that the functionality is not well tested on Windows yet, but
copying plain files works.
In a future CL, we will switch from scp to sftp. This CL adds support
for calling sftp from RemoteUtil.
In order to maintain backwards compatibility where people still set
--scp-command or CDC_SCP_COMMAND instead of the sftp versions, this CL
also adds the helper method RemoteUtil::ScpToSftpCommand, which
attempts to convert an scp command to an sftp command. This is usually
possible since the args are almost the same. For instance, if the scp
command is
C:\path\to\scp.exe -P 1234 -i <key_file> -oUserKnownHostsFile=known_hosts
then the corresponding sftp command is most likely
C:\path\to\sftp.exe -P 1234 -i <key_file> -oUserKnownHostsFile=known_hosts
This works for instance for OpenSSH.
This will be needed later for switching to sftp, since calling lcd in
sftp is tricky to get right (e.g. may or may not require /cygwin/c on
Windows, depending on whether sftp is native or not).
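The conversion could be sketched as below. This is not the actual RemoteUtil::ScpToSftpCommand; the function name ScpToSftpCommandSketch and the rule "replace the last scp in the executable token" are assumptions for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch of an scp-to-sftp conversion. It rewrites only the
// executable (the first, optionally quoted, token) and keeps all arguments
// unchanged. Returns an empty string if the executable does not contain
// "scp".
std::string ScpToSftpCommandSketch(const std::string& scp_command) {
  if (scp_command.empty()) return std::string();

  // Find the end of the executable token.
  size_t exe_end = scp_command[0] == '"' ? scp_command.find('"', 1)
                                         : scp_command.find(' ');
  if (exe_end == std::string::npos) exe_end = scp_command.size();

  // Replace the last "scp" in the executable token, so that a directory
  // like C:\scp_tools\ is left alone.
  const std::string exe = scp_command.substr(0, exe_end);
  const size_t pos = exe.rfind("scp");
  if (pos == std::string::npos) return std::string();

  return exe.substr(0, pos) + "sftp" + exe.substr(pos + 3) +
         scp_command.substr(exe_end);
}
```

With the scp command from the example above, this yields the sftp command shown there. Arguments that differ between scp and sftp would still have to be adjusted by hand via the sftp flag or env var.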
Fixes an issue in UnzstdStream where the Read() method always tries to
read new input data if no input data is available, instead of first
trying to uncompress. Since zstd maintains internal buffers,
uncompression might succeed even without reading more input, so this
is faster. This bug can lead to pipeline stalls in cdc_rsync.
But...
ONCE AND FOR ALL!
A recent change introduced WaitForWatching(), which was supposed to
block until the file watcher is actively monitoring the directory.
However this always returned immediately since the watcher is in
kFailed state if the directory was deleted, which counts as watching
(IsStarted returns true for both kWatching and kFailed states).
This CL adds an IsWatching() helper function that returns true only for
the kWatching state, which means that the directory is actively being
watched.
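The difference between the two checks can be sketched as follows. Only kWatching and kFailed are named above; the other enum values and the free-function form are assumptions for illustration.

```cpp
// Hypothetical watcher states. Only kWatching and kFailed appear in the
// description above; the other values are placeholders.
enum class WatcherState { kDefault, kWatching, kFailed, kShuttingDown };

// True once the watcher has started, even if the watched directory has been
// deleted in the meantime (kFailed).
bool IsStarted(WatcherState state) {
  return state == WatcherState::kWatching || state == WatcherState::kFailed;
}

// True only while the directory is actively being watched.
bool IsWatching(WatcherState state) {
  return state == WatcherState::kWatching;
}
```

A wait loop built on IsStarted() exits immediately in the kFailed state, whereas one built on IsWatching() keeps waiting until the directory has been recreated and is watched again.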
Makes ServerSocket multi-platform, mainly by working around some small
API differences. The code is largely the same, there should be no
differences on Linux.
Also moves WSAStartup() and WSACleanup() up to the Socket level as
static methods because it's used by both ClientSocket and ServerSocket,
and because it doesn't make sense to do that in the socket class as
that would prevent one from using several sockets.
Adds a flag to set the SSH forwarding port or port range used for
'cdc_stream start-service' and 'cdc_rsync'.
If a single number is passed, e.g. --forward-port 12345, then this
port is used without checking availability of local and remote ports.
If the port is taken, this results in an error when trying to connect.
Note that this restricts the number of connections that stream can
make to one.
If a range is passed, e.g. --forward-port 45000-46000, the tools
search for available ports locally and remotely in that range. This is
more robust, but a bit slower due to the extra overhead.
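Parsing the flag value could look like the sketch below. The names PortRange, ParsePort and ParsePortRange are hypothetical, and the validation rules (ports in 1-65535, first <= last) are assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical result type. For a single port, first == last.
struct PortRange {
  uint16_t first = 0;
  uint16_t last = 0;
};

// Parses a decimal port number in the range 1-65535.
bool ParsePort(const std::string& text, uint16_t* port) {
  if (text.empty() || text.size() > 5) return false;
  unsigned long value = 0;
  for (char c : text) {
    if (c < '0' || c > '9') return false;
    value = value * 10 + static_cast<unsigned long>(c - '0');
  }
  if (value == 0 || value > 65535) return false;
  *port = static_cast<uint16_t>(value);
  return true;
}

// Parses "12345" (single port) or "45000-46000" (inclusive range).
// Returns false on malformed input.
bool ParsePortRange(const std::string& arg, PortRange* range) {
  const size_t dash = arg.find('-');
  if (dash == std::string::npos) {
    if (!ParsePort(arg, &range->first)) return false;
    range->last = range->first;
    return true;
  }
  return ParsePort(arg.substr(0, dash), &range->first) &&
         ParsePort(arg.substr(dash + 1), &range->last) &&
         range->first <= range->last;
}
```

A caller could then skip the availability scan when first == last and search the range otherwise.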
Optimizes port_manager_win as it was very slow for a large port range.
It's still not optimal, but scanning 30k ports now takes well under
1 second.
Fixes #12
This CL removes the port arguments for both tools.
The port argument can also be specified via the ssh-command and
scp-command flags. In fact, if a port is specified by both port flags
and ssh/scp commands, they interfere with each other. For ssh, the one
specified in ssh-command wins. For scp, the one specified in
scp-command wins. To fix this, one would have to parse scp-command and
remove the port arg there. Or we could just remove the ssh-port arg.
This is what this CL does. Note that if you need a custom port, it's
very likely that you also have to define custom ssh and scp commands.
This CL adds Python integration tests for cdc_stream. To run the
tests, you need to supply a Linux host and proper configuration for
cdc_stream to work:
set CDC_SSH_COMMAND=C:\path\to\ssh.exe <args>
set CDC_SCP_COMMAND=C:\path\to\scp.exe <args>
C:\python38\python.exe -m integration_tests.cdc_stream.all_tests --binary_path=C:\full\path\to\cdc_stream.exe --user_host=user@host
Ran the tests and made sure they worked.
[cdc_rsync] Add integration tests
This CL adds Python integration tests for cdc_rsync. To run the tests,
you need to supply a Linux host and proper configuration for cdc_rsync
to work:
set CDC_SSH_COMMAND=C:\path\to\ssh.exe <args>
set CDC_SCP_COMMAND=C:\path\to\scp.exe <args>
C:\python38\python.exe -m integration_tests.cdc_rsync.all_tests --binary_path=C:\full\path\to\cdc_rsync.exe --user_host=user@host
Ran the tests and made sure they worked.
There were two problems:
- Writing the date on Windows used the wrong syntax. In PowerShell,
env variables are addressed as $env:NAME, not $NAME.
- Use different caches for opt vs fastbuild. We are currently using
opt caches for fastbuilds, which results in lots of cache misses.
* [cdc_stream] Fix issues found in tests
Fixes a couple of issues found by integration testing:
- Unicode command line args in cdc_stream show up as question marks.
- Log is still named assets_stream_manager instead of cdc_stream.
- An error message contains stadia_assets_stream_manager_v3.exe.
- mount_dir was not the last arg as required by FUSE
- Promoted cache cleanup logs to INFO level since they're important
for the proper workings of the system.
- Asset streaming cache dir is still %APPDATA%\GGP\asset_streaming.
* Address comments
Uses a bazel --disk_cache to cache build outputs between builds. Bazel
also has a local cache, e.g. in ~/.cache/bazel/_bazel_$USER/cache, but
that one can't be used as it won't reuse data across checkouts. A disk
cache is like a remote cache, except that it's on the local disk.
Github first looks for a cache with the given exact key in the current
branch, then in the main branch. If there's a cache hit, the cache
isn't updated (they're read-only!). To prevent caches from becoming
stale, they are timestamped using the current year and month, so that
the cache is force-renewed every month. Bazel disk caches also just
grow, so this technique prevents the cache from growing indefinitely
and eventually causing cache thrashing.
Implements cdc_stream stop-service. Also fixes an issue in the
BackgroundService implementation where Exit() would deadlock since
server shutdown waits for all RPCs to exit.
Starts the streaming service if it's not up and running. This required
adding the ability to run a detached process. By default, all child
processes are killed when the parent process exits. Since detached
child processes don't run with a console, they need to create sub-
processes with CREATE_NO_WINDOW since otherwise a new console pops up,
e.g. for every ssh command.
Polls for 20 seconds while the service starts up. For this purpose,
a BackgroundServiceClient is added. This will be reused in a future CL
by a new stop-service command to exit the service.
Also adds --service-port as additional argument to start-service.
There was a race condition in RecreateWatchedDir: there was a
brief period between the second dir change event and when the file
watcher was actually watching again. If the file was written during
that brief period, it would be missed. The issue could be reproduced
easily by adding a sleep here:
// The watched directory exists and its handle is valid.
if (!first_run) {
  ++dir_recreate_count_;
  if (dir_recreated_cb_) dir_recreated_cb_();
  Util::Sleep(1);
}
This CL waits until the watcher is watching again.
Switch asset_stream_manager to use Lyra
Lyra has a nice simple interface, but a few quirks that we work
around, mainly in the BaseCommand class:
- It does not support return values from running a command.
- It does not support return values from a custom arg parser.
- Lyra interprets --bad_arg as positional argument.
Fixes #15
The issue was consistently reproducible by adding a sleep right after starting the process.
Use ping instead of timeout now, because ping doesn't read user input.
The test
FileWatcherTest/FileWatcherParameterizedTest.RecreateWatchedDir/ReadDirectoryChangesExW
is flaky. This CL doesn't fix the root cause, but it fixes the
indefinite spin in GetChangedFiles when there is no file change.
Expand path variables for sync destination
Running commands like cdc_rsync C:\assets\* host:~/assets -vr would create a directory called ~assets. This CL expands path variables properly.
Modifies the create_release workflow in 2 ways:
- It only runs now if something is pushed to main.
- It creates a tagged release if a tag is pushed.
To create a tagged release, run e.g.
git tag -a v0.1.0 -m "Release 0.1.0"
git push origin v0.1.0
Improve readme
This CL adds
- a history section with references to Stadia
- benchmarks
- animated gifs with demos
- a troubleshooting section
- and more info about cdc_stream
* Add a Github action for building and testing
On Windows, -- -//third_party/... doesn't seem to work, so add all test directories manually. Also run the tests_*. We run only fastbuild tests here, since the opt tests will be run in the release workflow.
Also fix a number of compilation and test issues found along the way.
So far, errors from the remote netstat process would only be logged in
the asset stream service, for instance when SSH auth failed. However,
the errors were not shown to the client, and that's the most important
thing.
Also adds some feedback to cdc_stream in case of success.
Implements the cdc_stream client and adjusts asset streaming in
various places to work better outside of a GGP environment.
This CL tries to get quoting for SSH commands right. It also brings
back the ability to start a streaming session from
asset_stream_manager.
Also cleans up Bazel targets setup. Since the sln file is now in root,
it is no longer necessary to prepend ../ to relative filenames to
make clicking on errors work.
Fixes a couple of issues with the FUSE:
- Creates the mount directory if it does not exist.
This assumes the mount dir to be the last arg. Ideally, we'd parse the
command line and then create the directory, but unfortunately
fuse_parse_cmdline already verifies that the dir exists.
- Expands the cache_dir (e.g. ~).
- Fixes a compile issue in manifest_iterator.