DVC integration

Hello,

My site name is upscalerjs.com.

I’m using a tool called DVC as part of my build script (think Git LFS). DVC stores files on Google Drive, for which it requires a service account JSON file.

Here is my build script:

pip install --upgrade pip && pip install dvc[all] && echo '>>>>>>>>' && GDRIVE_CREDENTIALS_DATA=$GDRIVE_CREDENTIALS_DATA dvc pull -v -r gdrive-service-account && npx pnpm i --store=node_modules/.pnpm-store && npx pnpm run build

The script fails at dvc pull. It seems that I am passing $GDRIVE_CREDENTIALS_DATA incorrectly.

I’ve confirmed that the GDRIVE_CREDENTIALS_DATA environment variable is set and that the JSON string is encoded correctly. One thing I will note: if I set the variable to foo and echo it, I see foo; if I set it to the JSON string, the echo prints only >>>>. I assume Netlify is doing some obfuscation here.
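
Since echoing the value only shows masked output, one way to check what the build actually receives is to report properties of the value instead of the value itself. The following is a minimal sketch, not something from DVC or Netlify: `describe_credentials` is a hypothetical helper that reports whether the string parses as JSON, which top-level keys it has, and what its `type` field is. The ValueError in the log below reports the credentials type as None, which would be consistent with a parsed object that has no `type` key, so that field is the one to look at.

```python
import json
import os


def describe_credentials(raw):
    """Summarize a credentials string without revealing any of its values."""
    if not raw:
        return "empty or unset"
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return f"not valid JSON ({exc.msg} at position {exc.pos})"
    if not isinstance(data, dict):
        return f"valid JSON, but a {type(data).__name__} rather than an object"
    # Report only key names, the "type" field, and the raw length.
    return (
        f"JSON object, type={data.get('type')!r}, "
        f"keys={sorted(data)}, length={len(raw)}"
    )


if __name__ == "__main__":
    # Reads the same variable the build script passes to dvc pull.
    print(describe_credentials(os.environ.get("GDRIVE_CREDENTIALS_DATA")))
```

Running this as a step before dvc pull should show whether the variable arrives as a JSON object with `type` set to `service_account`, or whether it is empty, truncated, or a different kind of credentials file.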

For comparison, here is a similar implementation in GitHub Actions that works.

Relevant bits of the build log:

4:07:12 AM: >>>>>>>>
4:07:12 AM: 2023-02-28 09:07:12,409 DEBUG: v2.45.1 (pip), CPython 3.8.10 on Linux-5.4.228-131.415.amzn2.x86_64-x86_64-with-glibc2.29
4:07:12 AM: 2023-02-28 09:07:12,409 DEBUG: command: /opt/buildhome/python3.8/bin/dvc pull -v -r gdrive-service-account
4:07:12 AM: 2023-02-28 09:07:12,725 DEBUG: Preparing to transfer data from '1tGm1wnv7pAhbSuy4u9Ci61xA8WD3DXXr' to '/opt/build/repo/.dvc/cache'
4:07:12 AM: 2023-02-28 09:07:12,725 DEBUG: Preparing to collect status from '/opt/build/repo/.dvc/cache'
4:07:12 AM: 2023-02-28 09:07:12,725 DEBUG: Collecting status from '/opt/build/repo/.dvc/cache'
4:07:12 AM: 2023-02-28 09:07:12,726 DEBUG: Preparing to collect status from '1tGm1wnv7pAhbSuy4u9Ci61xA8WD3DXXr'
4:07:12 AM: 2023-02-28 09:07:12,726 DEBUG: Collecting status from '1tGm1wnv7pAhbSuy4u9Ci61xA8WD3DXXr'
4:07:12 AM: 2023-02-28 09:07:12,726 DEBUG: Querying 1 oids via object_exists
4:07:12 AM: 2023-02-28 09:07:12,875 ERROR: unexpected error - Failed to authenticate GDrive: ('Unexpected credentials type', None, 'Expected', 'service_account')
4:07:12 AM: Traceback (most recent call last):
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/output.py", line 1011, in get_dir_cache
4:07:12 AM:     ocheck(self.cache, obj)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_data/hashfile/__init__.py", line 20, in check
4:07:12 AM:     odb.check(obj.oid, **kwargs)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_data/hashfile/db/__init__.py", line 183, in check
4:07:12 AM:     _, actual = hash_file(obj.path, obj.fs, self.hash_name, self.state)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_data/hashfile/hash.py", line 178, in hash_file
4:07:12 AM:     hash_value, meta = _hash_file(path, fs, name, callback=cb, info=info)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_data/hashfile/hash.py", line 121, in _hash_file
4:07:12 AM:     info = info or fs.info(path)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_objects/fs/base.py", line 481, in info
4:07:12 AM:     return self.fs.info(path, **kwargs)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_objects/fs/local.py", line 42, in info
4:07:12 AM:     return self.fs.info(path)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/fsspec/implementations/local.py", line 87, in info
4:07:12 AM:     out = os.stat(path, follow_symlinks=False)
4:07:12 AM: FileNotFoundError: [Errno 2] No such file or directory: '/opt/build/repo/.dvc/cache/b3/3f1f6aa2be21da44cbbd212427c505.dir'
4:07:12 AM: During handling of the above exception, another exception occurred:
4:07:12 AM: Traceback (most recent call last):
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/pydrive2/fs/spec.py", line 70, in _wrap_errors
4:07:12 AM:     yield
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/pydrive2/fs/spec.py", line 150, in _service_auth
4:07:12 AM:     auth.ServiceAuth()
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/pydrive2/auth.py", line 100, in _decorated
4:07:12 AM:     decoratee(self, *args, **kwargs)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/pydrive2/auth.py", line 319, in ServiceAuth
4:07:12 AM:     ServiceAccountCredentials.from_json_keyfile_dict(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/oauth2client/service_account.py", line 251, in from_json_keyfile_dict
4:07:12 AM:     return cls._from_parsed_json_keyfile(keyfile_dict, scopes,
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/oauth2client/service_account.py", line 171, in _from_parsed_json_keyfile
4:07:12 AM:     raise ValueError('Unexpected credentials type', creds_type,
4:07:12 AM: ValueError: ('Unexpected credentials type', None, 'Expected', 'service_account')
4:07:12 AM: The above exception was the direct cause of the following exception:
4:07:12 AM: Traceback (most recent call last):
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/cli/__init__.py", line 210, in main
4:07:12 AM:     ret = cmd.do_run()
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/cli/command.py", line 26, in do_run
4:07:12 AM:     return self.run()
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/commands/data_sync.py", line 31, in run
4:07:12 AM:     stats = self.repo.pull(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/repo/__init__.py", line 58, in wrapper
4:07:12 AM:     return f(repo, *args, **kwargs)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/repo/pull.py", line 34, in pull
4:07:12 AM:     processed_files_count = self.fetch(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/repo/__init__.py", line 58, in wrapper
4:07:12 AM:     return f(repo, *args, **kwargs)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/repo/fetch.py", line 86, in fetch
4:07:12 AM:     d, f = _fetch(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/repo/fetch.py", line 142, in _fetch
4:07:12 AM:     used = repo.used_objs(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/repo/__init__.py", line 476, in used_objs
4:07:12 AM:     for odb, objs in self.index.used_objs(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/repo/index.py", line 449, in used_objs
4:07:12 AM:     for odb, objs in stage.get_used_objs(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/stage/__init__.py", line 722, in get_used_objs
4:07:12 AM:     for odb, objs in out.get_used_objs(*args, **kwargs).items():
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/output.py", line 1100, in get_used_objs
4:07:12 AM:     obj = self._collect_used_dir_cache(**kwargs)
4:07:15 AM: Failed during stage 'building site': Build script returned non-zero exit code: 2 (https://ntl.fyi/exit-code-2)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/output.py", line 1033, in _collect_used_dir_cache
4:07:12 AM:     self.get_dir_cache(jobs=jobs, remote=remote)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/output.py", line 1015, in get_dir_cache
4:07:12 AM:     self.repo.cloud.pull([obj.hash_info], **kwargs)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/data_cloud.py", line 181, in pull
4:07:12 AM:     return self.transfer(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc/data_cloud.py", line 135, in transfer
4:07:12 AM:     return transfer(src_odb, dest_odb, objs, **kwargs)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_data/hashfile/transfer.py", line 203, in transfer
4:07:12 AM:     status = compare_status(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_data/hashfile/status.py", line 189, in compare_status
4:07:12 AM:     src_exists, src_missing = status(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_data/hashfile/status.py", line 134, in status
4:07:12 AM:     exists = hashes.intersection(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_data/hashfile/status.py", line 55, in _indexed_dir_hashes
4:07:12 AM:     dir_exists.update(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/tqdm/std.py", line 1183, in __iter__
4:07:12 AM:     for obj in iterable:
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_objects/db.py", line 357, in list_oids_exists
4:07:12 AM:     in_remote = self.fs.exists(paths, batch_size=jobs)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_objects/fs/base.py", line 332, in exists
4:07:12 AM:     if self.fs.async_impl:
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/funcy/objects.py", line 50, in __get__
4:07:12 AM:     return prop.__get__(instance, type)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/funcy/objects.py", line 28, in __get__
4:07:12 AM:     res = instance.__dict__[self.fget.__name__] = self.fget(instance)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/dvc_gdrive/__init__.py", line 105, in fs
4:07:12 AM:     return _GDriveFileSystem(self._path, **self._settings)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/fsspec/spec.py", line 76, in __call__
4:07:12 AM:     obj = super().__call__(*args, **kwargs)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/pydrive2/fs/spec.py", line 220, in __init__
4:07:12 AM:     google_auth = _service_auth(
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/pydrive2/fs/spec.py", line 150, in _service_auth
4:07:12 AM:     auth.ServiceAuth()
4:07:12 AM:   File "/usr/lib/python3.8/contextlib.py", line 131, in __exit__
4:07:12 AM:     self.gen.throw(type, value, traceback)
4:07:12 AM:   File "/opt/buildhome/python3.8/lib/python3.8/site-packages/pydrive2/fs/spec.py", line 76, in _wrap_errors
4:07:12 AM:     raise GDriveAuthError("Failed to authenticate GDrive") from exc
4:07:12 AM: pydrive2.fs.spec.GDriveAuthError: Failed to authenticate GDrive
4:07:12 AM: 2023-02-28 09:07:12,930 DEBUG: Version info for developers:
4:07:12 AM: DVC version: 2.45.1 (pip)
4:07:12 AM: -------------------------
4:07:12 AM: Platform: Python 3.8.10 on Linux-5.4.228-131.415.amzn2.x86_64-x86_64-with-glibc2.29
4:07:12 AM: Subprojects:
4:07:12 AM: 	dvc_data = 0.40.3
4:07:12 AM: 	dvc_objects = 0.19.3
4:07:12 AM: 	dvc_render = 0.2.0
4:07:12 AM: 	dvc_task = 0.1.11
4:07:12 AM: 	dvclive = 2.1.0
4:07:12 AM: 	scmrepo = 0.1.11
4:07:12 AM: Supports:
4:07:12 AM: 	azure (adlfs = 2023.1.0, knack = 0.10.1, azure-identity = 1.12.0),
4:07:12 AM: 	gdrive (pydrive2 = 1.15.1),
4:07:12 AM: 	gs (gcsfs = 2023.1.0),
4:07:12 AM: 	hdfs (fsspec = 2023.1.0, pyarrow = 11.0.0),
4:07:12 AM: 	http (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
4:07:12 AM: 	https (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
4:07:12 AM: 	oss (ossfs = 2021.8.0),
4:07:12 AM: 	s3 (s3fs = 2023.1.0, boto3 = 1.24.59),
4:07:12 AM: 	ssh (sshfs = 2023.1.0),
4:07:12 AM: 	webdav (webdav4 = 0.9.8),
4:07:12 AM: 	webdavs (webdav4 = 0.9.8),
4:07:12 AM: 	webhdfs (fsspec = 2023.1.0)
4:07:12 AM: Cache types: <https://error.dvc.org/no-dvc-cache>
4:07:12 AM: Caches: local
4:07:12 AM: Remotes: gdrive, s3, gdrive
4:07:12 AM: Workspace directory: xfs on /dev/nvme0n1p1
4:07:12 AM: Repo: dvc, git
4:07:12 AM: Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
4:07:12 AM: 2023-02-28 09:07:12,932 DEBUG: Analytics is enabled.
4:07:12 AM: 2023-02-28 09:07:12,959 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmp02lbkm15']'
4:07:12 AM: 2023-02-28 09:07:12,960 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmp02lbkm15']'
4:07:13 AM: ​
4:07:13 AM:   "build.command" failed                                        
4:07:13 AM: ────────────────────────────────────────────────────────────────
4:07:13 AM: ​
4:07:13 AM:   Error message
4:07:13 AM:   Command failed with exit code 255: pip install --upgrade pip && pip install dvc[all] && echo '>>>>>>>>' && GDRIVE_CREDENTIALS_DATA=$GDRIVE_CREDENTIALS_DATA dvc pull -v -r gdrive-service-account && npx pnpm i --store=node_modules/.pnpm-store && npx pnpm run build (https://ntl.fyi/exit-code-255)
4:07:13 AM: ​
4:07:13 AM:   Error location
4:07:13 AM:   In Build command from Netlify app:
4:07:13 AM:   pip install --upgrade pip && pip install dvc[all] && echo '>>>>>>>>' && GDRIVE_CREDENTIALS_DATA=$GDRIVE_CREDENTIALS_DATA dvc pull -v -r gdrive-service-account && npx pnpm i --store=node_modules/.pnpm-store && npx pnpm run build
4:07:13 AM: ​
4:07:13 AM:   Resolved config
4:07:13 AM:   build:
4:07:13 AM:     base: /opt/build/repo/docs
4:07:13 AM:     command: pip install --upgrade pip && pip install dvc[all] && echo '>>>>>>>>' && GDRIVE_CREDENTIALS_DATA=$GDRIVE_CREDENTIALS_DATA dvc pull -v -r gdrive-service-account && npx pnpm i --store=node_modules/.pnpm-store && npx pnpm run build
4:07:13 AM:     commandOrigin: ui
4:07:13 AM:     environment:
4:07:13 AM:       - GDRIVE_CREDENTIALS_DATA
4:07:13 AM:       - NETLIFY_BUILD_DEBUG
4:07:13 AM:       - PYTHON_VERSION
4:07:13 AM:       - NPM_FLAGS
4:07:13 AM:     ignore: git diff --quiet $CACHED_COMMIT_REF $COMMIT_REF .
4:07:13 AM:     publish: /opt/build/repo/docs/build
4:07:13 AM:     publishOrigin: config
4:07:13 AM: Caching artifacts
4:07:13 AM: Started saving node modules
4:07:13 AM: Finished saving node modules
4:07:13 AM: Started saving build plugins
4:07:13 AM: Finished saving build plugins
4:07:13 AM: Started saving corepack cache
4:07:13 AM: Finished saving corepack cache
4:07:13 AM: Started saving pip cache
4:07:13 AM: Finished saving pip cache
4:07:13 AM: Started saving emacs cask dependencies
4:07:13 AM: Finished saving emacs cask dependencies
4:07:13 AM: Started saving maven dependencies
4:07:13 AM: Finished saving maven dependencies
4:07:13 AM: Started saving boot dependencies
4:07:13 AM: Finished saving boot dependencies
4:07:13 AM: Started saving rust rustup cache
4:07:13 AM: Finished saving rust rustup cache
4:07:13 AM: Started saving go dependencies
4:07:13 AM: Finished saving go dependencies
4:07:15 AM: Build failed due to a user error: Build script returned non-zero exit code: 2
4:07:15 AM: Failing build: Failed to build site
4:07:15 AM: Finished processing build request in 1m59.269s

Can someone shed more light on how sensitive environment variables are handled during builds and deploys?

I’ve tried writing the environment variable to a local file and then running cat on that file. The output reads ****.

I’ve tried manually setting the environment variable inline, e.g. GDRIVE_CREDENTIALS_DATA='<MY_CONTENTS>', but the logged command reads GDRIVE_CREDENTIALS_DATA='****' dvc pull.

I’ve set the Sensitive Variable Policy to “Deploy without restrictions”. It doesn’t seem to make a difference.

Is there any way I can avoid obfuscating my environment variables?
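
One workaround sometimes used for multi-line JSON secrets in CI is to store the value base64-encoded and decode it to a file at build time, so quoting and newlines cannot alter it on the way in. Below is a rough sketch under stated assumptions: `GDRIVE_CREDENTIALS_B64` is a hypothetical variable name that would have to be created by hand, and `write_credentials_file` is an illustrative helper, not part of DVC. I have not verified that this resolves the failure above.

```python
import base64
import json
import os
import sys


def write_credentials_file(encoded, path):
    """Decode base64 text, check it is a JSON object, and write it to path.

    Returns the value of the "type" field (None if the field is missing).
    """
    decoded = base64.b64decode(encoded).decode("utf-8")
    data = json.loads(decoded)
    if not isinstance(data, dict):
        raise ValueError("decoded credentials are not a JSON object")
    # Create the file readable and writable by the current user only.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(data, handle)
    return data.get("type")


if __name__ == "__main__":
    # GDRIVE_CREDENTIALS_B64 is a hypothetical variable, not one DVC reads.
    creds_type = write_credentials_file(
        os.environ["GDRIVE_CREDENTIALS_B64"], sys.argv[1]
    )
    print(f"wrote credentials file, type={creds_type!r}")
```

DVC's documentation describes a gdrive_service_account_json_file_path remote option that could then point at the written file, though whether that path behaves differently from GDRIVE_CREDENTIALS_DATA on Netlify is an open question.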

Hey @theory,

We see something very different from what you describe when we check your site. Your current deploys seem to be failing at something other than DVC. Has something changed since you wrote in?