* initial version of a FAST pre-install skill * first round of testing * Update fast-0-org-setup-prereqs skill with improved UX and local path handling - Add explicit lockout warning and stop condition if the user is not a member of the provided Admin Principal group. - Streamline bootstrap project selection to only prompt for an override if the active gcloud project is rejected. - Restrict dataset discovery strictly to the `fast/stages/0-org-setup/datasets/` directory. - Improve location handling by referencing `defaults.schema.json` for Standard GCP and auto-configuring fixed regions for GCD. - Add comprehensive `local_path` management: prompt for customization, create directories, move `defaults.yaml` to the local data folder, and symlink `0-org-setup.auto.tfvars` back to the stage directory. * add testing scenarios, implement initial changes for scenario 2 * move skills * move to a skills/fast subfolder * Refactor fast-0-org-setup prereqs skill * Add skill-turn-harness utility tool * Use relative markdown links for skill references * Use descriptive titles for markdown links in skill references * Add descriptions to each phase in the prerequisites workflow map * Use backslash for markdown line breaks in skill map * Update README security warning to mention default .gitignore * shebang * Update fast prereqs skill rules to force sequential question flow and refine harness tool with proper ctrl+c handling and slugified log paths * Move playbook-gcp-dev.yaml to fast/prerequisites/gcp-dev.yaml and update fast prerequisites * docs(skill-turn-harness): detail autonomous pond testing approach * docs(skill-turn-harness): add final_state_checks to pond architecture and update toc * Refine fast prereqs SKILL and gcp-dev playbook to strictly align with one-question-at-a-time rule * feat(skill-turn-harness): update playbook schema for autonomous persona mode * feat(skill-turn-harness): implement autonomous persona testing mode and fallback logic * docs(skill-turn-harness): document the three modes of testing and update ToC * implement timeout, schema validation, configurable cli * chore: remove accidentally committed log files * chore: ignore logs directory * feat(skill-harness): implement tool execution interception, configurable workspace, and modularized validation * feat(skill-harness): add model configuration and update README * fix(skill-harness): automatically inject -y flag to gemini commands * docs(skill-harness): add TODO.md with analysis for skill environment dependencies * feat(skill-harness): add working_dir support and clean up fixtures - Implement working_dir in harness to run tests in specific directories. - Rename test fixtures and playbooks to be more descriptive. - Add E2E test for working_dir. - Apply code quality improvements to harness.py (imports, linting). - Update README with working directory considerations and usage notes. - Update phase3-bootstrap-and-iam.md skill doc to add execution rule against creating temp scripts. * fix: capture customer_id and respect relative paths * Implement isolated temp workspace sandboxing with symlinks in test harness * Configure GCD manual autonomous playbook and align Phase 3/4 steps order * Fix linting and schema tests failures - Add missing license headers to tools/skill-turn-harness files. - Fix trailing spaces and newlines in playbooks. - Ignore tools directory in schema tests workflow. TAG=agy CONV=1bb75453-c3e2-448b-bae9-8e332a068012 * Fix Python formatting with yapf TAG=agy CONV=1bb75453-c3e2-448b-bae9-8e332a068012 * Refactor skill-turn-harness to use Antigravity SDK - Migrated harness from gemini-cli subprocesses to Antigravity SDK. - Implemented real-time step streaming and console logging. - Added color-coded terminal output (dark gray headers, blue inputs, pink outputs). - Collapsed excessive newlines in streamed thoughts. - Excluded harness codebase from workspace copy to prevent agent cheating. - Enabled skills folder copy to resolve agent lookup loops. - Added key validation and CLI --debug flag. * Fix autonomous turn layout: print Turn ID before execution - Moved the [Autonomous Turn X] header print to before running the agent turn. - This groups the real-time thinking and tool calls under the correct Turn ID block, instead of displaying them before the label. * Remove obsolete .log.md from prerequisites skill directory
40 lines
3.0 KiB
Markdown
40 lines
3.0 KiB
Markdown
# Test Harness TODOs & Analysis
|
|
|
|
## The Problem: Skill Environment Dependencies
|
|
|
|
During E2E testing of the `fast-0-org-setup-prereqs` skill, the autonomous agent hallucinated that it needed to `git clone` the `cloud-foundation-fabric` repository.
|
|
|
|
This occurred because the test harness executes the Gemini CLI in a completely empty, isolated temporary workspace (`/tmp/gemini_harness_*`). However, the FAST skill *assumes* it is being executed from the root of the `cloud-foundation-fabric` repository because it needs to:
|
|
1. Read available datasets from `fast/stages/0-org-setup/datasets/*/defaults.yaml`.
|
|
2. Create a symbolic link from the generated `0-org-setup.auto.tfvars` file to `fast/stages/0-org-setup/0-org-setup.auto.tfvars`.
|
|
|
|
Because these directories did not exist in the isolated workspace, the agent failed to complete Phase 4 of the setup.
|
|
|
|
## Proposed Solutions for Next Session
|
|
|
|
We need a way to provide the agent with the expected repository structure during tests without compromising the safety and isolation of the test harness.
|
|
|
|
### Option 1: Add a `working_dir` Playbook Attribute
|
|
Allow the playbook to specify a `working_dir` (e.g., the actual path to the `cloud-foundation-fabric` repo) where the CLI should be executed, bypassing the temporary workspace creation.
|
|
|
|
**Risks & Impacts:**
|
|
* **Chat History Pollution:** The test's conversation will be saved to the global `~/.gemini/tmp/cloud-foundation-fabric/chats/` directory, mixing test runs with the developer's actual day-to-day CLI usage.
|
|
* **Session Retrieval:** The harness will need to be updated to find the *newest* `session-*.json` file in that directory, rather than assuming it's the only one.
|
|
* **File Modification Risk:** If the agent hallucinates or a test is poorly written, it could modify or delete real files in the repository instead of sandboxed test files.
|
|
* **Cleanup:** The harness cannot safely clean up the workspace after the test completes.
|
|
|
|
### Option 2: The "Symlink Sandbox" (Recommended)
|
|
Keep the isolated temporary workspace (`/tmp/gemini_harness_*`), but add a `symlink_paths` array to the playbook schema.
|
|
|
|
Before the test starts, the harness would dynamically create symbolic links from the real repository (e.g., `fast/`, `modules/`) into the temporary workspace.
|
|
|
|
**Benefits:**
|
|
* **Total Isolation:** The agent's chat history remains isolated in a temporary `.gemini/tmp/gemini_harness_*/` directory.
|
|
* **Safe Execution:** The agent sees the directory structure it expects (and can read the `defaults.yaml` files), but any new files it creates (like the `custom-fast-config` directory or the `0-org-setup.auto.tfvars` symlink) are created safely inside the temporary workspace.
|
|
* **Automatic Cleanup:** The entire workspace (including the symlinks and generated files) is safely deleted when the test finishes.
|
|
|
|
## Next Steps
|
|
1. Decide between Option 1 (`working_dir`) and Option 2 (`symlink_paths`).
|
|
2. Implement the chosen solution in `harness.py`.
|
|
3. Update `playbook.schema.json` and `README.md`.
|
|
4. Re-run the `gcp-dev-autonomous.yaml` E2E test to verify Phase 4 completes successfully. |