Flutter, AI and Testing: a new method to code faster and more safely

In 2025, the massive arrival of AI agents in our IDEs changed the way we code. But without a solid framework, their speed can become a risk. In this article, I show you how I structured a Flutter project around a winning trio: clear specifications, automated tests, and AI assistance. The result: a more reliable, faster, and much more pleasant workflow.

1. Why I revisited the way I code in Flutter

When I started with Flutter, I was mostly focused on UX/UI, animations, performance. But with the arrival of AI, a new challenge appeared:

How do you collaborate with an AI without endangering an app's stability?

Coding faster is easy. Coding faster with confidence is another matter.

And that's where I rediscovered an old idea: no code without a clear intent.

2. Before the code: defining intentions (spec-driven development)

2.1. Why a spec above all?

Most AI prompts fail because context is missing. The AI has to guess what we want.

When you specify your intentions in a spec file, you transform your workspace into an explicit, controlled environment.

A spec can contain:

  • the description of screens and their roles
  • user flows
  • business rules
  • edge cases
  • errors to handle
  • technical constraints

Minimal spec example:

# Story: Scanning a PSA-graded Pokémon card

## Goal

The user scans a PSA-graded Pokémon card (via the QR code or the certification number) to add it to their digital collection.

## Main flow

1. The user opens the scan screen.
2. The camera activates automatically.
3. The QR code / PSA number is detected.
4. The app fetches the card data via the PSA API (Pokémon name, set, number, grade, certification number, possibly an image).
5. The app displays a detail sheet with the PSA data plus the app's own data (collection, tags, notes...).

## Business rules

- If the card is already in the collection → show a dialog "This PSA card is already in your collection".
- If the PSA API does not respond → show an explicit error and offer to retry.
- If the QR code / number is unreadable → show an error message and allow rescanning.
- The scan + PSA API call must complete in under 500 ms under normal conditions.

This file becomes your source of truth, both for you and for your AI agent.

3. The AI as a "super-fast junior dev"... but with guardrails

3.1. What the AI does very well

  • generate classes, services, etc.
  • create reusable widgets
  • write tedious boilerplate
  • refactor code
  • suggest cleaner patterns

3.2. What it does poorly without context

  • handle business rules
  • maintain global consistency
  • anticipate edge cases
  • understand a complex flow without a spec

Hence the importance of a good AGENTS.md file, or a clear spec accessible to your agents. You build the framework, the AI builds the code.
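As a starting point, here is what a minimal AGENTS.md could look like for this project. The contents are purely illustrative, a sketch to adapt to your own conventions rather than a prescribed format:

```markdown
# AGENTS.md — context for AI agents

## Project
Flutter app for collecting PSA-graded Pokémon cards.

## Source of truth
- Specs live in `specs/` (one file per story). Read the relevant spec before coding.
- Business rules in the specs override anything inferred from the code.

## Conventions
- Architecture: `lib/domain` (logic), `lib/data` (repositories), `lib/ui` (widgets).
- Every new business rule comes with a unit test in `test/unit/`.
- Every new screen comes with a widget test in `test/widget/`.

## Definition of done
- `flutter test` and `flutter test integration_test` are green.
- The corresponding spec in `specs/` is updated.
```

The exact file name matters less than the habit: a short, stable document that every agent (human or AI) reads before touching the code.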

4. The central role of tests in this workflow

To turn an AI agent into a reliable developer, tests become your safety net.

4.1. The three layers of Flutter tests

🟦 Unit tests

To test business logic:

import 'package:flutter_test/flutter_test.dart';
import 'package:myapp/domain/pokemon_service.dart';

void main() {
  group('PokemonService', () {
    test('adding a PSA card that is already in the collection', () {
      final service = PokemonService();

      service.addFromPsaCert('PSA-123');
      final result = service.addFromPsaCert('PSA-123');

      expect(result.isSuccess, false);
      expect(result.errorCode, 'already_exists');
    });
  });
}

🟨 Widget tests

To validate the UI + interactions without launching the full app. Example: verify that the scan screen correctly displays the PSA card preview returned by the API.

import 'package:flutter_test/flutter_test.dart';
import 'package:flutter/material.dart';
import 'package:myapp/ui/scan_screen.dart';
import 'package:myapp/domain/pokemon_service.dart';

class FakePokemonService extends PokemonService {
  @override
  Future<PokemonCard> fetchFromPsa(String certId) async {
    return PokemonCard(
      id: 'PSA-123',
      name: 'Pikachu',
      setName: 'Base Set',
      grade: '10',
    );
  }
}

void main() {
  testWidgets('displays the PSA card sheet after a successful scan', (WidgetTester tester) async {
    await tester.pumpWidget(
      MaterialApp(
        home: ScanScreen(
          pokemonService: FakePokemonService(),
        ),
      ),
    );

    // Simulate a successful scan result here (e.g. via a callback).
    final state = tester.state<ScanScreenState>(find.byType(ScanScreen));
    await state.onScanResult('PSA-123');
    await tester.pumpAndSettle();

    expect(find.text('Pikachu'), findsOneWidget);
    expect(find.text('Base Set'), findsOneWidget);
    expect(find.text('PSA 10'), findsOneWidget);
  });
}

🟩 Integration tests

To simulate a real user journey: open the app, go to the scan screen, scan a PSA card, view the card sheet, and optionally take a screenshot usable in an issue.

import 'package:flutter_test/flutter_test.dart';
import 'package:integration_test/integration_test.dart';
import 'package:myapp/main.dart' as app;
import 'dart:io';

void main() {
  IntegrationTestWidgetsFlutterBinding.ensureInitialized();

  testWidgets('full PSA scan flow', (WidgetTester tester) async {
    app.main();
    await tester.pumpAndSettle();

    // Navigate to the scan screen
    await tester.tap(find.byKey(const Key('go-to-scan-button')));
    await tester.pumpAndSettle();

    // Here you can mock the camera / scan layer to return a PSA certId,
    // e.g. via a FakeScanner or a debug flag.

    // Once the scan is simulated, wait for the card sheet to load.
    await tester.pumpAndSettle(const Duration(seconds: 1));

    expect(find.text('Pikachu'), findsOneWidget);

    // Take a screenshot and store it in a folder tied to an issue
    final binding = IntegrationTestWidgetsFlutterBinding.instance;
    final bytes = await binding.takeScreenshot('scan_psa_flow');
    final dir = Directory('test_screenshots/issue-123');
    if (!dir.existsSync()) {
      dir.createSync(recursive: true);
    }
    final file = File('test_screenshots/issue-123/scan_psa_flow.png');
    await file.writeAsBytes(bytes);
  });
}

This pattern is powerful: every identified bug can get its own integration scenario plus a screenshot in a dedicated folder (e.g. test_screenshots/issue-123/). It is invaluable for documenting regressions and easing communication within the team (or with your future self).

4.2. The complete AI + tests cycle

  1. You write/complete the spec.
  2. You let your AI agent digest it.
  3. The AI proposes code.
  4. You run the tests (unit, widget, integration).
  5. Everything is green → merge.
  6. A test fails → adjust the code or the spec.

4.3. Diagram of the complete workflow

We can summarize this workflow in a simple loop:

spec → AI proposes code → tests run → green: merge / red: fix the code or the spec → update the docs → back to the spec

The important idea: docs and specs are not a one-time exercise. They evolve at the same pace as code and tests.

4.4. Better leverage screenshots in the testing process

Taking a screenshot at the end of an integration test is not just a nice-to-have. You can turn it into a real working tool:

  • 📌 Bug documentation: when you open an issue (GitHub, Jira...), you can attach the screenshot generated by the test (test_screenshots/issue-123/scan_psa_flow.png). You know exactly what the screen looked like when it broke.
  • 🧪 Lightweight visual regression: without setting up a heavy visual testing pipeline, you can already compare screenshots between two branches / two versions. A crude visual diff can reveal omissions (a button that disappears, overflowing text, etc.).
  • 🧭 Onboarding & storytelling: a well-organized test_screenshots folder is also a way to quickly show "what a given flow looks like" without launching the app. Handy when onboarding a new dev, a QA, or when documenting for your future self.
  • 🔁 Traceability: the issue ID in the path (issue-123) creates a natural link between spec → test → capture → ticket. When you close the ticket, you can keep the capture as a "before/after" photo in the discussion.
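The "lightweight visual regression" idea above can be sketched in a few lines of shell: byte-compare the same screenshot across two branches and report any difference. The paths are illustrative, and a byte-for-byte `cmp` is deliberately crude (any rendering or encoding change triggers it), but it costs nothing to run:

```shell
#!/bin/sh
# Crude visual regression: byte-compare the same screenshot across two runs.
# Paths are illustrative; adapt them to your own test_screenshots layout.

screenshots_match() {
  # Byte-for-byte comparison; any difference (or missing file) returns non-zero.
  [ -f "$1" ] && [ -f "$2" ] && cmp -s "$1" "$2"
}

baseline="test_screenshots/main/scan_psa_flow.png"
candidate="test_screenshots/issue-123/scan_psa_flow.png"

if screenshots_match "$baseline" "$candidate"; then
  echo "OK: no visual difference for scan_psa_flow"
else
  echo "Difference (or missing screenshot) detected: $candidate vs $baseline" >&2
fi
```

Run it after both branches' test runs have produced their screenshots; anything on stderr is your cue to eyeball the two images side by side.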

To go further, you can even automate uploading these screenshots as CI artifacts, or automatically post them as PR comments for critical flows.

5. Concrete example on a Flutter codebase

Here is an example of a code + tests structure from a real Flutter project:

lib/
  domain/
    pokemon_service.dart
  ui/
    scan_screen.dart
    pokemon_card.dart
  data/
    pokemon_repository.dart

test/
  unit/
    pokemon_service_test.dart
  widget/
    scan_screen_test.dart
  integration/
    scan_flow_test.dart

specs/
  scan_pokemon.md
  add_to_collection.md
  error_cases.md

And on the CI side:

flutter test
flutter test integration_test

This kind of organization makes AI + dev collaboration extremely smooth.
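On GitHub Actions, the CI side could look like the following sketch. The action names and versions are assumptions to adapt, and note that integration tests usually need a device or emulator available on the runner:

```yaml
# .github/workflows/tests.yml — illustrative sketch, adapt to your project.
name: tests
on: [pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: subosito/flutter-action@v2  # community action, pin as you see fit
      - run: flutter pub get
      - run: flutter test
      # Integration tests need a device/emulator available on the runner:
      # - run: flutter test integration_test
      - uses: actions/upload-artifact@v4
        if: always()  # keep screenshots even when a test fails
        with:
          name: test-screenshots
          path: test_screenshots/
```

The `upload-artifact` step is what turns the screenshots from section 4.4 into downloadable evidence attached to every run.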

6. Why this method changes everything

Faster

The AI does the tedious work. You focus on logic, business, UX.

More reliable

The spec + tests turn the AI into a safe tool.

More maintainable

Refactoring becomes a real pleasure when everything is tested.

More aligned with the business

The reasoning starts from the spec, not from an improvised implementation.

7. Update the docs: close the loop

There is often a missing step in our workflows: go back to the docs once the work is done.

Every time you:

  • add a new business rule,
  • fix a bug,
  • modify a flow,

...you should update:

  • the corresponding spec,
  • optionally the context file (AGENTS.md, architecture doc),
  • the reference to the related test and screenshot.

That's what turns your docs into living documentation, aligned with the code. The AI benefits (better context), and so do you when you come back to the project 6 months later.

8. Limitations (let's be honest)

  • Overkill for small features.
  • A bad spec = a bad result, even with AI.
  • Complex UI still requires manual finesse.
  • Requires a bit of discipline (but the maintainability gains are significant).

9. To go further: Spec Kit and AI-driven specs

If you want to push the approach further, you can take a look at Spec Kit, GitHub's open source tool designed for spec-driven development:

  • you write a detailed spec first (the "what" and the "why"),
  • the tool helps derive a technical plan and a task list,
  • then you let your AI agent generate code from those artifacts,
  • and you link all that to your existing tests.

The idea is the same as what we've seen, but formalized in a generic tool, designed to work with multiple AI assistants. Even if Spec Kit currently targets the web ecosystem, the philosophy is 100% reusable in another context:

  • start with solid specs,
  • keep a clear separation between intent (docs) and implementation (code),
  • use AI as a fast executor, not as the source of truth,
  • lock everything with automated tests.

You can easily imagine a future where:

  • Spec Kit (or an equivalent) describes your features at a high level,
  • an AI agent generates the Flutter layer (UI + logic),
  • and your Flutter tests validate that the implementation matches the spec.

In short, if you're a lead dev or thinking about industrializing your workflow with AI, it's worth experimenting with these kinds of tools, if only to see how your approach to specs evolves.

Conclusion

The future of Flutter development will not be just "coding faster with AI". It will be:

coding with intent + coding with assistance + coding with safety.

A workflow where:

  • the spec says what we want,
  • the AI proposes how to do it,
  • tests guarantee that it's well done.

And you, have you already tried this kind of workflow in Flutter? I'd be curious to read your feedback.

Tags

  • flutter
  • iOS
  • android
  • tests
  • AI
  • spec-driven-development
  • workflow
  • experience-report
  • architecture
  • best-practices

