ImplementationProcess

The Five-Week AI Agent Deployment: A Week-by-Week Breakdown

Takuya Matsumoto June 25, 2024

We run every Askhub implementation in five weeks. This post walks through each week in detail.

When we tell a new customer that we can have an AI agent live in their environment within five weeks, the response is usually some variation of skepticism. They have been in meetings with SI vendors who quoted nine months. They have read articles about enterprise AI timelines measured in years. Five weeks sounds like a sales claim.

It is not. It is the result of a deployment architecture and a scoping discipline that we have refined across multiple implementations. This post is the unvarnished week-by-week breakdown — what we do, what the customer needs to do, and what can actually slow things down.

One upfront clarification: five weeks assumes a reasonably bounded initial use case. A company that wants to deploy an agent across 15 data sources with 8 custom workflow integrations in week one is not a five-week project. The five-week timeline applies to a production-grade agent for a specific use case with 2-4 defined data sources. Extensions and expansions come after production launch.

Week 1: Discovery and Data Connectivity

The first week is the most important and the one most likely to be underestimated. Its purpose is not to write code — it is to understand the actual data landscape and establish live connections to the source systems the agent will query.

Day 1 is a scoping workshop. We map the specific business question the agent will answer, identify who the end users are, define what a correct answer looks like versus an acceptable approximate answer versus an unacceptable response, and agree on the success criteria we will measure in production. This conversation needs to include the person who will use the agent every day, not just the DX project manager. The specifics of how employees phrase questions, what edge cases matter, and where the current manual process breaks down — this knowledge lives with end users, not with IT.

Days 2 through 5 are data connectivity. Askhub connectors are pointed at the actual systems: the kintone apps, the SharePoint folder structure, the REST API endpoint, the PostgreSQL table. We validate that live data comes through correctly, that authentication works, and that the data model is understood well enough to write meaningful queries against it. Any access credentials or IP allowlisting that need to be configured on the customer's infrastructure side happen this week — delays here push everything else back by the same number of days. This is the most common cause of timeline slippage: customers who cannot provision API access or network allowlisting within the first week.

Week 2: Agent Flow Design and Knowledge Base Assembly

With live data connections in place, week 2 is where the agent's intelligence is built. This involves two parallel streams of work.

The first stream is knowledge base assembly: chunking and indexing the document corpus, configuring retrieval parameters, and running test queries against the live knowledge base to verify that relevant chunks are being retrieved for representative questions. This is methodical work. We test against at least 30 representative queries drawn from the actual end users' expected question patterns — not generic test queries we invent ourselves. The retrieval quality at the end of week 2 is the ceiling for agent answer quality; no amount of prompt engineering can compensate for retrieval that returns irrelevant chunks.

The second stream is agent flow design in the Askhub builder. We define the conversational flow: how the agent handles a well-scoped question it can answer confidently, how it handles a question that falls outside the indexed knowledge base, how it handles ambiguous queries that need clarification, and when and how it escalates to a human. For customer-facing agents, we also define the agent's tone and the Japanese language register (casual versus formal keigo versus teineigo — this matters significantly for user acceptance in Japanese enterprise contexts).

Customer involvement this week: review of the 30-query test set to confirm these are representative of real user questions, plus review of the escalation logic and language register decisions. Approximately 2-3 hours of stakeholder time.

Week 3: Integration Testing and Edge Case Handling

Week 3 runs the agent against the full customer data set with a broader set of test scenarios. This is where the interesting problems surface. A manufacturing company's internal knowledge agent, for example, was handling product specification queries well but giving confused answers when users asked questions that crossed the boundary between the product catalog app and the technical support ticket history — two kintone apps that our week-1 mapping had identified as semantically related but structurally separate. Week 3 was where we built the cross-app resolution logic to handle that class of question correctly.

We document every test failure and classify it: retrieval failure (wrong chunks returned), reasoning failure (correct chunks but wrong synthesis), out-of-scope query (question the agent should not try to answer), or edge case in the agent flow logic. Retrieval failures go back to the indexing configuration. Reasoning failures often point to prompt refinements or context structure adjustments. Out-of-scope queries get added to the agent's explicit refusal cases. Flow logic edge cases get patched in the Askhub builder.

This week also includes a security review pass: checking that the agent cannot be induced to return data outside the access permissions agreed in week 1, that no raw internal document text is being surfaced verbatim in ways that would violate internal confidentiality expectations, and that the deployment channel (Slack, Teams, or web widget) has the appropriate access controls configured.

Week 4: User Acceptance Testing and Stakeholder Review

Week 4 is the customer's week. A controlled group of actual end users — typically 5-10 people from the target business unit — get access to the agent in a staging environment and use it for real work for three to four days. We observe but do not intervene during this period. The goal is to surface the failure patterns that structured testing missed, because users ask questions in ways that were never anticipated during the flow design phase.

At the end of week 4, we run a structured review session. Every notable failure case from UAT gets triaged in the room: is this a retrieval issue, a flow issue, or a data gap (content that should be in the knowledge base but is not)? Data gaps — missing documents, outdated content — are flagged for the customer to address before launch. Flow and retrieval issues are fixed by us in real time.

We are not saying UAT will catch everything. We are saying that three to four days of real-user testing under production-representative conditions will surface the issues that matter most, and week 4 is the last moment to address them without the pressure of a live deployment.

Week 5: Production Deployment, Monitoring Setup, and Handover

Production deployment day is typically Tuesday or Wednesday of week 5 — never a Friday. We deploy to the target channel, notify users, and monitor live query traffic in real time for the first four hours. The most common production-day issue is query volume: a surge of initial user curiosity creates more concurrent requests than the staging environment tested, occasionally surfacing a caching configuration that was tuned for lower concurrency. These are fast to resolve but need a monitoring-eyes-on period to catch.

The second half of week 5 is handover. We configure the monitoring dashboards the customer will use to track agent health: query volume, response latency, success rate (defined as queries that did not result in an escalation or an explicit "I cannot answer this" response), and the weekly document refresh status. We run a handover session with the person designated as the ongoing agent owner — typically a technical person in the IT or DX team — covering how to add new documents to the knowledge base, how to adjust the agent flow in the Askhub builder, and what the monitoring metrics mean and when to escalate a concern to us.

The goal of the handover is that the customer owns the agent from this point forward. We remain available for support and for planned expansions, but the production agent should not require our involvement to run on a daily basis. Dependency on the vendor for routine operations is how agents get abandoned when the vendor relationship changes.

What Can Extend the Timeline

Three things most reliably push a deployment past five weeks: access provisioning delays in week 1 (the single biggest cause), a knowledge base that requires significant content work before it can be indexed (outdated documents, missing coverage, content locked in formats that cannot be parsed cleanly), and UAT sign-off delays caused by stakeholder scheduling conflicts. All three are on the customer side, not the platform side. We can only compress the timeline on variables we control.

When we scope a new project, we state these three risk factors explicitly in the kickoff documentation. Not as caveats, but as items that need a designated owner and a hard deadline on the customer's side before we commit to a go-live date. Five weeks is achievable and repeatable — with the right preparation in place before week 1 starts.

Week 1: Discovery and Data Connectivity

Week 2: Agent Flow Design and Knowledge Base Assembly

Week 3: Integration Testing and Edge Case Handling

Week 4: User Acceptance Testing and Stakeholder Review

Week 5: Production Deployment, Monitoring Setup, and Handover

What Can Extend the Timeline

More from the blog

Why Japanese Enterprise AI Pilots Stall

HR Knowledge Agent Case Study

Data Sovereignty and AI Agents