lobe-chat/docs/development/basic/add-new-image-model.mdx · lo/lobehub - AtomGit

GGitHub🔨 chore: some ai image optimization (#8543 )
82ca0074创建于 2025年7月23日历史提交
# Adding New Image Models

> Learn more about the AI image generation modal design in the [AI Image Generation Modal Design Discussion](https://github.com/lobehub/lobe-chat/discussions/7442)

## Parameter Standardization

All image generation models must use the standard parameters defined in `src/libs/standard-parameters/index.ts`. This ensures parameter consistency across different Providers, creating a more unified user experience.

**Supported Standard Parameters**:

- `prompt` (required): Text prompt for image generation
- `aspectRatio`: Aspect ratio (e.g., "16:9", "1:1")
- `width` / `height`: Image dimensions
- `size`: Preset dimensions (e.g., "1024x1024")
- `seed`: Random seed
- `steps`: Generation steps
- `cfg`: Guidance scale
- For other parameters, please check the source file

## OpenAI Compatible Models

These models can be requested using the OpenAI SDK, with request parameters and return values consistent with DALL-E and GPT-Image-X series.

Taking Zhipu's CogView-4 as an example, which is an OpenAI-compatible model, you can add it by adding the model configuration in the corresponding AI models file `src/config/aiModels/zhipu.ts`:

```ts
const zhipuImageModels: AIImageModelCard[] = [
  // Add model configuration
  // https://bigmodel.cn/dev/howuse/image-generation-model/cogview-4
  {
    description:
      'CogView-4 is the first open-source text-to-image model from Zhipu that supports Chinese character generation, with comprehensive improvements in semantic understanding, image generation quality, and Chinese-English text generation capabilities.',
    displayName: 'CogView-4',
    enabled: true,
    id: 'cogview-4',
    parameters: {
      prompt: {
        default: '',
      },
      size: {
        default: '1024x1024',
        enum: ['1024x1024', '768x1344', '864x1152', '1344x768', '1152x864', '1440x720', '720x1440'],
      },
    },
    releasedAt: '2025-03-04',
    type: 'image',
  },
];
```

## Non-OpenAI Compatible Models

For image generation models that are not compatible with OpenAI format, you need to implement a custom `createImage` method. There are two main implementation approaches:

### Method 1: Using OpenAI Compatible Factory

Most Providers use `openaiCompatibleFactory` for OpenAI compatibility. You can pass in a custom `createImage` function (reference [PR #8534](https://github.com/lobehub/lobe-chat/pull/8534)).

**Implementation Steps**:

1. **Read Provider documentation and standard parameter definitions**
   - Review the Provider's image generation API documentation to understand request and response formats
   - Read `src/libs/standard-parameters/index.ts` to understand supported parameters
   - Add image model configuration in the corresponding AI models file

2. **Implement custom createImage method**
   - Create a standalone image generation function that accepts standard parameters
   - Convert standard parameters to Provider-specific format
   - Call the Provider's image generation API
   - Return a unified response format (imageUrl and optional width/height)

3. **Add tests**
   - Write unit tests covering success scenarios
   - Test various error cases and edge conditions

**Code Example**:

```ts
// src/libs/model-runtime/provider-name/createImage.ts
export const createProviderImage = async (
  payload: ImageGenerationPayload,
  options: any,
): Promise<ImageGenerationResponse> => {
  const { model, prompt, ...params } = payload;

  // Call Provider's native API
  const result = await callProviderAPI({
    model,
    prompt,
    // Convert parameter format
    custom_param: params.width,
    // ...
  });

  // Return unified format
  return {
    created: Date.now(),
    data: [{ url: result.imageUrl }],
  };
};
```

```ts
// src/libs/model-runtime/provider-name/index.ts
export const LobeProviderAI = openaiCompatibleFactory({
  constructorOptions: {
    // ... other configurations
  },
  createImage: createProviderImage, // Pass custom implementation
  provider: ModelProvider.ProviderName,
});
```

### Method 2: Direct Implementation in Provider Class

If your Provider has an independent class implementation, you can directly add the `createImage` method in the class (reference [PR #8503](https://github.com/lobehub/lobe-chat/pull/8503)).

**Implementation Steps**:

1. **Read Provider documentation and standard parameter definitions**
   - Review the Provider's image generation API documentation
   - Read `src/libs/standard-parameters/index.ts`
   - Add image model configuration in the corresponding AI models file

2. **Implement createImage method in Provider class**
   - Add the `createImage` method directly in the class
   - Handle parameter conversion and API calls
   - Return a unified response format

3. **Add tests**
   - Write comprehensive test cases for the new method

**Code Example**:

```ts
// src/libs/model-runtime/provider-name/index.ts
export class LobeProviderAI {
  async createImage(
    payload: ImageGenerationPayload,
    options?: ChatStreamCallbacks,
  ): Promise<ImageGenerationResponse> {
    const { model, prompt, ...params } = payload;

    // Call native API and handle response
    const result = await this.client.generateImage({
      model,
      prompt,
      // Parameter conversion
    });

    return {
      created: Date.now(),
      data: [{ url: result.url }],
    };
  }
}
```

### Important Notes

- **Testing Requirements**: Add comprehensive unit tests for custom implementations, ensuring coverage of success scenarios and various error cases
- **Error Handling**: Use `AgentRuntimeError` consistently for error wrapping to maintain error message consistency