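Introduction

Many AI models today are paid services, so I tried deploying free models that run locally to check whether they can hold a smooth conversation. The goal is to reuse this setup in future projects.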

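Choosing and preparing local models

I currently have three lightweight models installed:

#qwen:0.5b  the lightest, but not very smart
#llama3.2:1b  the 1B model is a bit smarter
#llama3.2:3b  the 3B model chats more smoothly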

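Setting up the development environment

  • JDK version (I use JDK 1.8 because I did not bother to upgrade)
  • Dependencies (copy the pom.xml below as-is)
  • IDE setup (IntelliJ IDEA + Maven)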

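Preparation

  • Install Ollama: double-click the installer and click Next/Install all the way through; it takes about a minute. The service starts automatically after installation, and a llama icon in the system tray means it worked.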
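Besides looking for the tray icon, the service can be probed from code. The sketch below is only an illustration under a few assumptions: the class name `OllamaHealthCheck` is made up for this post, the port 11434 is the default that application.yml uses further down, and it relies on a running Ollama instance answering a plain GET on its root URL with HTTP 200.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper, not part of the demo project.
final class OllamaHealthCheck {

    /** Returns true if an HTTP GET on baseUrl answers with status 200 within timeoutMs. */
    public static boolean isUp(String baseUrl, int timeoutMs) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(baseUrl).openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(timeoutMs);
            conn.setReadTimeout(timeoutMs);
            return conn.getResponseCode() == 200;
        } catch (IOException e) {
            // Connection refused, timeout, or malformed URL: treat all as "not running".
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // 11434 is assumed to be the port Ollama listens on (its default).
        System.out.println("Ollama reachable: " + isUp("http://localhost:11434", 2000));
    }
}
```

If this prints `false` right after installation, the service has probably not started yet or listens on a different port.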

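Downloading models:

Run the commands below in PowerShell. The models differ as follows:

Ollama model quick reference (Windows-friendly)
1) Ultra-light (downloads and starts in seconds, good for testing)

powershell

ollama run qwen:0.5b
  • Model: Qwen (Tongyi Qianwen) 0.5B
  • Size: about 400MB
  • Pros: very fast and light on resources, a good fit for a first try
  • Use: testing LangChain, getting a feel for the LLM workflow

powershell

ollama run tinyllama
  • Tiny and very fast
2) Lightweight but usable (enough for everyday Q&A)

powershell

ollama run qwen:1.8b
  • Qwen (Tongyi Qianwen) 1.8B
  • Size: about 1.1GB
  • Answers are clearly better than with 0.5B

powershell

ollama run llama3.2:3b
  • Meta Llama 3.2 3B, strong in English and decent in Chinese
3) Full experience (close to a proper assistant)

powershell

ollama run qwen:7b

powershell

ollama run llama3:8b
  • Size: about 4~5GB
  • Requirement: at least 16GB of RAM to run smoothly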
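The three tiers above boil down to a simple rule of thumb based on available memory. The following is a rough sketch of that rule, not a benchmark result: the 16 GB threshold comes from the list above, while the 8 GB cut-off for the middle tier and the class name `ModelPicker` are my own assumptions.

```java
// Hypothetical helper that turns the tier list into a lookup.
final class ModelPicker {

    /**
     * Maps installed RAM (in GB) to one model tag per tier.
     * The 16 GB threshold is taken from the text; the 8 GB threshold is an assumption.
     */
    public static String suggest(int ramGb) {
        if (ramGb >= 16) {
            return "qwen:7b";      // tier 3: full experience
        }
        if (ramGb >= 8) {
            return "llama3.2:3b";  // tier 2: lightweight but usable (assumed cut-off)
        }
        return "qwen:0.5b";        // tier 1: quick tests only
    }

    public static void main(String[] args) {
        for (int ram : new int[] {4, 8, 16}) {
            System.out.println(ram + " GB -> ollama run " + suggest(ram));
        }
    }
}
```

Treat the result as a starting point and adjust it after trying the models on your own machine.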

Below are the Java project structure and the complete source code, ready to copy:

D:\demo01
├─ pom.xml
├─ settings.xml
└─ src
   └─ main
      ├─ java
      │  └─ com
      │     └─ example
      │        └─ ollamachatmemory
      │           ├─ OllamaChatMemoryApplication.java        // Spring Boot entry point
      │           ├─ config
      │           │  └─ OllamaProperties.java                // Ollama configuration properties (baseUrl, defaultModel, etc.)
      │           ├─ controller
      │           │  ├─ ApiExceptionHandler.java             // Maps validation and Ollama errors to JSON responses
      │           │  ├─ ChatController.java                  // REST endpoints under /api/v1/chat
      │           │  └─ HomeController.java                  // Forwards the root path "/" to index.html
      │           ├─ dto
      │           │  ├─ ChatRequest.java                     // Chat request body (sessionId, message, model)
      │           │  └─ ChatResponse.java                    // Chat response body (sessionId, model, answer, historyMessageCount)
      │           ├─ model
      │           │  ├─ ChatMessage.java                     // A single message (role, content)
      │           │  ├─ OllamaChatRequest.java               // Request body sent to Ollama (model, messages, stream)
      │           │  └─ OllamaChatResponse.java              // Response body returned by Ollama (model, message)
      │           └─ service
      │              └─ OllamaChatService.java               // Core chat service: session history management + calls to Ollama
      └─ resources
         ├─ application.yml                                  // Application config (port, Ollama address, maxHistoryMessages, etc.)
         └─ static
            └─ index.html                                    // Front-end chat page (shows the context memory)
pom.xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.7.18</version>
        <relativePath/>
    </parent>

    <groupId>com.example</groupId>
    <artifactId>ollama-chat-memory</artifactId>
    <version>0.0.1-SNAPSHOT</version>
    <name>ollama-chat-memory</name>
    <description>Spring Boot chat demo with Ollama memory</description>

    <properties>
        <java.version>1.8</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-configuration-processor</artifactId>
            <optional>true</optional>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-validation</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
settings.xml
<?xml version="1.0" encoding="UTF-8"?>
<settings xmlns="http://maven.apache.org/SETTINGS/1.2.0"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.2.0 https://maven.apache.org/xsd/settings-1.2.0.xsd">
    
    <mirrors>
        <!-- Aliyun mirror -->
        <mirror>
            <id>aliyunmaven</id>
            <mirrorOf>central</mirrorOf>
            <name>Aliyun public repository</name>
            <url>https://maven.aliyun.com/repository/public</url>
        </mirror>
    </mirrors>

    <profiles>
        <profile>
            <id>jdk18</id>
            <activation>
                <activeByDefault>true</activeByDefault>
                <jdk>1.8</jdk>
            </activation>
            <properties>
                <maven.compiler.source>1.8</maven.compiler.source>
                <maven.compiler.target>1.8</maven.compiler.target>
                <maven.compiler.compilerVersion>1.8</maven.compiler.compilerVersion>
            </properties>
        </profile>
    </profiles>

</settings>
OllamaProperties
package com.example.ollamachatmemory.config;

import org.springframework.boot.context.properties.ConfigurationProperties;

@ConfigurationProperties(prefix = "ollama")
public class OllamaProperties {

    private String baseUrl = "http://localhost:11434";
    private String defaultModel = "llama3.2:1b";
    private int maxHistoryMessages = 20;

    public String getBaseUrl() {
        return baseUrl;
    }

    public void setBaseUrl(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    public String getDefaultModel() {
        return defaultModel;
    }

    public void setDefaultModel(String defaultModel) {
        this.defaultModel = defaultModel;
    }

    public int getMaxHistoryMessages() {
        return maxHistoryMessages;
    }

    public void setMaxHistoryMessages(int maxHistoryMessages) {
        this.maxHistoryMessages = maxHistoryMessages;
    }
}
ApiExceptionHandler
package com.example.ollamachatmemory.controller;

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.validation.FieldError;
import org.springframework.web.bind.MethodArgumentNotValidException;
import org.springframework.web.bind.annotation.ExceptionHandler;
import org.springframework.web.bind.annotation.RestControllerAdvice;
import org.springframework.web.client.RestClientException;

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

@RestControllerAdvice
public class ApiExceptionHandler {

    @ExceptionHandler(MethodArgumentNotValidException.class)
    public ResponseEntity<Map<String, String>> handleValidation(MethodArgumentNotValidException ex) {
        String message = ex.getBindingResult()
                .getFieldErrors()
                .stream()
                .map(FieldError::getDefaultMessage)
                .collect(Collectors.joining("; "));

        Map<String, String> body = new LinkedHashMap<>();
        body.put("error", "validation_error");
        body.put("message", message);
        return ResponseEntity.badRequest().body(body);
    }

    @ExceptionHandler(RestClientException.class)
    public ResponseEntity<Map<String, String>> handleRestClient(RestClientException ex) {
        Map<String, String> body = new LinkedHashMap<>();
        body.put("error", "ollama_unavailable");
        body.put("message", "Cannot connect to Ollama. Confirm Ollama is running at localhost:11434.");
        body.put("detail", ex.getMessage());
        return ResponseEntity.status(HttpStatus.BAD_GATEWAY).body(body);
    }

    @ExceptionHandler(IllegalStateException.class)
    public ResponseEntity<Map<String, String>> handleIllegalState(IllegalStateException ex) {
        Map<String, String> body = new LinkedHashMap<>();
        body.put("error", "ollama_response_error");
        body.put("message", ex.getMessage());
        return ResponseEntity.status(HttpStatus.BAD_GATEWAY).body(body);
    }
}
ChatController
package com.example.ollamachatmemory.controller;

import com.example.ollamachatmemory.dto.ChatRequest;
import com.example.ollamachatmemory.dto.ChatResponse;
import com.example.ollamachatmemory.model.ChatMessage;
import com.example.ollamachatmemory.service.OllamaChatService;
import javax.validation.Valid;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.DeleteMapping;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.HashMap;
import java.util.List;
import java.util.Map;

@RestController
@RequestMapping("/api/v1/chat")
public class ChatController {

    private final OllamaChatService chatService;

    public ChatController(OllamaChatService chatService) {
        this.chatService = chatService;
    }

    @PostMapping
    public ResponseEntity<ChatResponse> chat(@Valid @RequestBody ChatRequest request) {
        ChatResponse response = chatService.chat(request.getSessionId(), request.getMessage(), request.getModel());
        return ResponseEntity.ok(response);
    }

    @GetMapping("/{sessionId}/history")
    public ResponseEntity<List<ChatMessage>> history(@PathVariable String sessionId) {
        return ResponseEntity.ok(chatService.getHistory(sessionId));
    }

    @DeleteMapping("/{sessionId}/history")
    public ResponseEntity<Map<String, String>> clearHistory(@PathVariable String sessionId) {
        chatService.clearSession(sessionId);
        Map<String, String> result = new HashMap<>();
        result.put("message", "history cleared");
        result.put("sessionId", sessionId);
        return ResponseEntity.ok(result);
    }
}
HomeController
package com.example.ollamachatmemory.controller;

import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.GetMapping;

@Controller
public class HomeController {

    @GetMapping("/")
    public String home() {
        return "forward:/index.html";
    }
}
ChatRequest
package com.example.ollamachatmemory.dto;

import javax.validation.constraints.NotBlank;

public class ChatRequest {

    @NotBlank(message = "sessionId cannot be blank")
    private String sessionId;

    @NotBlank(message = "message cannot be blank")
    private String message;

    private String model;

    public String getSessionId() {
        return sessionId;
    }

    public void setSessionId(String sessionId) {
        this.sessionId = sessionId;
    }

    public String getMessage() {
        return message;
    }

    public void setMessage(String message) {
        this.message = message;
    }

    public String getModel() {
        return model;
    }

    public void setModel(String model) {
        this.model = model;
    }
}
ChatResponse
package com.example.ollamachatmemory.dto;

public class ChatResponse {

    private String sessionId;
    private String model;
    private String answer;
    private int historyMessageCount;

    public ChatResponse() {
    }

    public ChatResponse(String sessionId, String model, String answer, int historyMessageCount) {
        this.sessionId = sessionId;
        this.model = model;
        this.answer = answer;
        this.historyMessageCount = historyMessageCount;
    }

    public String getSessionId() {
        return sessionId;
    }

    public void setSessionId(String sessionId) {
        this.sessionId = sessionId;
    }

    public String getModel() {
        return model;
    }

    public void setModel(String model) {
        this.model = model;
    }

    public String getAnswer() {
        return answer;
    }

    public void setAnswer(String answer) {
        this.answer = answer;
    }

    public int getHistoryMessageCount() {
        return historyMessageCount;
    }

    public void setHistoryMessageCount(int historyMessageCount) {
        this.historyMessageCount = historyMessageCount;
    }
}
ChatMessage
package com.example.ollamachatmemory.model;

public class ChatMessage {

    private String role;
    private String content;

    public ChatMessage() {
    }

    public ChatMessage(String role, String content) {
        this.role = role;
        this.content = content;
    }

    public String getRole() {
        return role;
    }

    public void setRole(String role) {
        this.role = role;
    }

    public String getContent() {
        return content;
    }

    public void setContent(String content) {
        this.content = content;
    }
}
OllamaChatRequest
package com.example.ollamachatmemory.model;

import java.util.List;

public class OllamaChatRequest {

    private String model;
    private List<ChatMessage> messages;
    private boolean stream;

    public OllamaChatRequest() {
    }

    public OllamaChatRequest(String model, List<ChatMessage> messages, boolean stream) {
        this.model = model;
        this.messages = messages;
        this.stream = stream;
    }

    public String getModel() {
        return model;
    }

    public void setModel(String model) {
        this.model = model;
    }

    public List<ChatMessage> getMessages() {
        return messages;
    }

    public void setMessages(List<ChatMessage> messages) {
        this.messages = messages;
    }

    public boolean isStream() {
        return stream;
    }

    public void setStream(boolean stream) {
        this.stream = stream;
    }
}
OllamaChatResponse
package com.example.ollamachatmemory.model;

public class OllamaChatResponse {

    private String model;
    private ChatMessage message;

    public String getModel() {
        return model;
    }

    public void setModel(String model) {
        this.model = model;
    }

    public ChatMessage getMessage() {
        return message;
    }

    public void setMessage(ChatMessage message) {
        this.message = message;
    }
}
OllamaChatService
package com.example.ollamachatmemory.service;

import com.example.ollamachatmemory.config.OllamaProperties;
import com.example.ollamachatmemory.dto.ChatResponse;
import com.example.ollamachatmemory.model.ChatMessage;
import com.example.ollamachatmemory.model.OllamaChatRequest;
import com.example.ollamachatmemory.model.OllamaChatResponse;
import org.springframework.http.MediaType;
import org.springframework.stereotype.Service;
import org.springframework.util.StringUtils;
import org.springframework.web.client.RestTemplate;
import org.springframework.http.client.ClientHttpRequestFactory;
import org.springframework.http.client.SimpleClientHttpRequestFactory;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.LinkedBlockingDeque;

@Service
public class OllamaChatService {

    private final RestTemplate restTemplate;
    private final OllamaProperties ollamaProperties;
    private final ConcurrentMap<String, Deque<ChatMessage>> sessionHistory = new ConcurrentHashMap<>();

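    public OllamaChatService(OllamaProperties ollamaProperties) {
        // Local models can take a while to answer: fail fast on connect, but
        // allow a generous read timeout instead of waiting indefinitely.
        // Both values are starting points; tune them for your hardware.
        SimpleClientHttpRequestFactory factory = new SimpleClientHttpRequestFactory();
        factory.setConnectTimeout(5_000);
        factory.setReadTimeout(120_000);
        this.restTemplate = new RestTemplate(factory);
        this.ollamaProperties = ollamaProperties;
    }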

    public ChatResponse chat(String sessionId, String userMessage, String requestedModel) {
        Deque<ChatMessage> history = sessionHistory.computeIfAbsent(sessionId, key -> new LinkedBlockingDeque<>());
        String model = StringUtils.hasText(requestedModel) ? requestedModel : ollamaProperties.getDefaultModel();

        synchronized (history) {
            history.addLast(new ChatMessage("user", userMessage));

            try {
                OllamaChatRequest payload = new OllamaChatRequest(
                        model,
                        new ArrayList<>(history),
                        false // stream=false: Ollama returns one complete JSON reply
                );

                String url = ollamaProperties.getBaseUrl() + "/api/chat";
                OllamaChatResponse response = restTemplate.postForObject(url, payload, OllamaChatResponse.class);

                String assistantReply = extractAssistantReply(response);
                history.addLast(new ChatMessage("assistant", assistantReply));
                trimHistory(history);

                return new ChatResponse(sessionId, model, assistantReply, history.size());
            } catch (RuntimeException ex) {
                // Roll back the user turn so a failed call does not leave an
                // unanswered message in the context sent with the next request.
                history.pollLast();
                throw ex;
            }
        }
    }

    public void clearSession(String sessionId) {
        sessionHistory.remove(sessionId);
    }

    public List<ChatMessage> getHistory(String sessionId) {
        Deque<ChatMessage> history = sessionHistory.get(sessionId);
        if (history == null) {
            return Collections.emptyList();
        }
        synchronized (history) {
            return new ArrayList<>(history);
        }
    }

    private String extractAssistantReply(OllamaChatResponse response) {
        if (response == null || response.getMessage() == null || !StringUtils.hasText(response.getMessage().getContent())) {
            throw new IllegalStateException("No assistant content returned from Ollama.");
        }
        return response.getMessage().getContent();
    }

    private void trimHistory(Deque<ChatMessage> history) {
        int maxHistoryMessages = Math.max(2, ollamaProperties.getMaxHistoryMessages());
        while (history.size() > maxHistoryMessages) {
            history.pollFirst();
        }
        while (!history.isEmpty() && Objects.equals(history.peekFirst().getRole(), "assistant")) {
            history.pollFirst();
        }
    }
}
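The trimming rule in `trimHistory` is easier to follow when tried in isolation. The sketch below mirrors its two loops on a plain list of role names; the class `HistoryTrimDemo` is only for illustration and is not part of the project.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Stand-alone illustration of the sliding-window rule used by the service.
final class HistoryTrimDemo {

    /** Keeps at most maxMessages entries (minimum 2), then drops leading "assistant" entries. */
    public static List<String> trim(List<String> roles, int maxMessages) {
        Deque<String> window = new ArrayDeque<>(roles);
        int limit = Math.max(2, maxMessages);
        // Step 1: drop the oldest messages until the window fits the limit.
        while (window.size() > limit) {
            window.pollFirst();
        }
        // Step 2: make sure the window starts with a user turn.
        while (!window.isEmpty() && "assistant".equals(window.peekFirst())) {
            window.pollFirst();
        }
        return new ArrayList<>(window);
    }

    public static void main(String[] args) {
        List<String> roles = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            roles.add("user");
            roles.add("assistant");
        }
        // Six messages, limit 3: the window [assistant, user, assistant]
        // loses its leading assistant turn as well.
        System.out.println(trim(roles, 3)); // prints [user, assistant]
    }
}
```

With an odd limit the effective window is therefore one message shorter than configured, which is why an even `max-history-messages` value is the more predictable choice.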
OllamaChatMemoryApplication
package com.example.ollamachatmemory;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.boot.context.properties.ConfigurationPropertiesScan;

@SpringBootApplication
@ConfigurationPropertiesScan
public class OllamaChatMemoryApplication {

    public static void main(String[] args) {
        SpringApplication.run(OllamaChatMemoryApplication.class, args);
    }
}
index.html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Ollama Chat</title>
    <style>
        * {
            margin: 0;
            padding: 0;
            box-sizing: border-box;
        }

        body {
            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif;
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            height: 100vh;
            display: flex;
            justify-content: center;
            align-items: center;
        }

        .chat-container {
            width: 90%;
            max-width: 900px;
            height: 85vh;
            background: white;
            border-radius: 20px;
            box-shadow: 0 20px 60px rgba(0, 0, 0, 0.3);
            display: flex;
            flex-direction: column;
            overflow: hidden;
        }

        .chat-header {
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            color: white;
            padding: 20px;
            text-align: center;
        }

        .chat-header h1 {
            font-size: 24px;
            margin-bottom: 10px;
        }

        .chat-header .subtitle {
            font-size: 12px;
            opacity: 0.9;
            margin-top: 5px;
        }

        .session-controls {
            display: flex;
            gap: 10px;
            justify-content: center;
            align-items: center;
        }

        .session-controls input {
            padding: 8px 12px;
            border: none;
            border-radius: 8px;
            font-size: 14px;
            width: 200px;
        }

        .session-controls button {
            padding: 8px 16px;
            background: rgba(255, 255, 255, 0.2);
            border: 1px solid rgba(255, 255, 255, 0.3);
            color: white;
            border-radius: 8px;
            cursor: pointer;
            font-size: 14px;
            transition: all 0.3s;
        }

        .session-controls button:hover {
            background: rgba(255, 255, 255, 0.3);
        }

        .chat-messages {
            flex: 1;
            overflow-y: auto;
            padding: 20px;
            background: #f5f5f5;
        }

        .message {
            margin-bottom: 15px;
            display: flex;
            animation: fadeIn 0.3s ease-in;
        }

        @keyframes fadeIn {
            from {
                opacity: 0;
                transform: translateY(10px);
            }
            to {
                opacity: 1;
                transform: translateY(0);
            }
        }

        .message.user {
            justify-content: flex-end;
        }

        .message.assistant {
            justify-content: flex-start;
        }

        .message-content {
            max-width: 70%;
            padding: 12px 16px;
            border-radius: 12px;
            word-wrap: break-word;
        }

        .message.user .message-content {
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            color: white;
            border-bottom-right-radius: 4px;
        }

        .message.assistant .message-content {
            background: white;
            color: #333;
            border-bottom-left-radius: 4px;
            box-shadow: 0 2px 5px rgba(0, 0, 0, 0.1);
        }

        .message-role {
            font-size: 12px;
            color: #999;
            margin-bottom: 4px;
        }

        .message.user .message-role {
            text-align: right;
        }

        .chat-input-container {
            padding: 20px;
            background: white;
            border-top: 1px solid #e0e0e0;
        }

        .input-wrapper {
            display: flex;
            gap: 10px;
        }

        .chat-input {
            flex: 1;
            padding: 12px 16px;
            border: 2px solid #e0e0e0;
            border-radius: 12px;
            font-size: 14px;
            resize: none;
            outline: none;
            transition: border-color 0.3s;
        }

        .chat-input:focus {
            border-color: #667eea;
        }

        .send-button {
            padding: 12px 24px;
            background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
            color: white;
            border: none;
            border-radius: 12px;
            cursor: pointer;
            font-size: 14px;
            font-weight: bold;
            transition: transform 0.2s;
        }

        .send-button:hover {
            transform: scale(1.05);
        }

        .send-button:disabled {
            background: #ccc;
            cursor: not-allowed;
            transform: none;
        }

        .loading {
            display: inline-block;
            width: 20px;
            height: 20px;
            border: 3px solid rgba(255, 255, 255, 0.3);
            border-radius: 50%;
            border-top-color: white;
            animation: spin 1s ease-in-out infinite;
        }

        @keyframes spin {
            to { transform: rotate(360deg); }
        }

        .error-message {
            background: #fee;
            color: #c33;
            padding: 10px;
            border-radius: 8px;
            margin-bottom: 10px;
            text-align: center;
        }

        .empty-state {
            text-align: center;
            color: #999;
            padding: 40px;
        }

        .empty-state p {
            font-size: 16px;
        }

        .message-info {
            font-size: 11px;
            color: #999;
            margin-top: 4px;
            text-align: right;
        }

        .message.user .message-info {
            color: rgba(255, 255, 255, 0.8);
        }
    </style>
</head>
<body>
    <div class="chat-container">
        <div class="chat-header">
            <h1>🤖 Ollama AI Chat</h1>
            <div class="session-controls">
                <input type="text" id="sessionId" placeholder="Session ID (e.g. user1)" value="default">
                <button onclick="loadHistory()">Load history</button>
                <button onclick="clearHistory()">Clear history</button>
            </div>
        </div>

        <div class="chat-messages" id="chatMessages">
            <div class="empty-state">
                <p>Start a new conversation! Type a message and send it.</p>
            </div>
        </div>

        <div class="chat-input-container">
            <div id="errorContainer"></div>
            <div class="input-wrapper">
                <textarea 
                    id="messageInput" 
                    class="chat-input" 
                    rows="1" 
                    placeholder="Type your message..."
                    onkeypress="handleKeyPress(event)"
                ></textarea>
                <button id="sendButton" class="send-button" onclick="sendMessage()">Send</button>
            </div>
        </div>
    </div>

    <script>
        const API_BASE_URL = '/api/v1/chat';
        let isSending = false;

        // Auto-resize the input box to fit its content
        const messageInput = document.getElementById('messageInput');
        messageInput.addEventListener('input', function() {
            this.style.height = 'auto';
            this.style.height = Math.min(this.scrollHeight, 150) + 'px';
        });

        function handleKeyPress(event) {
            if (event.key === 'Enter' && !event.shiftKey) {
                event.preventDefault();
                sendMessage();
            }
        }

        function getSessionId() {
            return document.getElementById('sessionId').value.trim() || 'default';
        }

        function addMessage(role, content, info = null) {
            const chatMessages = document.getElementById('chatMessages');
            
            // Remove the empty-state placeholder
            const emptyState = chatMessages.querySelector('.empty-state');
            if (emptyState) {
                emptyState.remove();
            }

            const messageDiv = document.createElement('div');
            messageDiv.className = `message ${role}`;
            
            const messageContent = document.createElement('div');
            messageContent.className = 'message-content';
            
            const roleDiv = document.createElement('div');
            roleDiv.className = 'message-role';
            roleDiv.textContent = role === 'user' ? 'You' : 'AI Assistant';
            
            const textDiv = document.createElement('div');
            textDiv.textContent = content;
            
            messageContent.appendChild(roleDiv);
            messageContent.appendChild(textDiv);
            
            // Attach message info (e.g. the number of context messages)
            if (info) {
                const infoDiv = document.createElement('div');
                infoDiv.className = 'message-info';
                infoDiv.textContent = info;
                messageContent.appendChild(infoDiv);
            }
            
            messageDiv.appendChild(messageContent);
            chatMessages.appendChild(messageDiv);
            
            // Scroll to the bottom
            chatMessages.scrollTop = chatMessages.scrollHeight;
        }

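        function showError(message) {
            const errorContainer = document.getElementById('errorContainer');
            // Build the node with textContent so the message is never parsed as HTML
            const errorDiv = document.createElement('div');
            errorDiv.className = 'error-message';
            errorDiv.textContent = message;
            errorContainer.innerHTML = '';
            errorContainer.appendChild(errorDiv);
            setTimeout(() => {
                errorContainer.innerHTML = '';
            }, 5000);
        }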

        async function sendMessage() {
            if (isSending) return;

            const message = messageInput.value.trim();
            if (!message) {
                showError('Please enter a message');
                return;
            }

            const sessionId = getSessionId();
            const sendButton = document.getElementById('sendButton');

            isSending = true;
            sendButton.disabled = true;
            sendButton.innerHTML = '<span class="loading"></span>';

            // Append the user's message to the UI
            addMessage('user', message);
            messageInput.value = '';
            messageInput.style.height = 'auto';

            try {
                const response = await fetch(API_BASE_URL, {
                    method: 'POST',
                    headers: {
                        'Content-Type': 'application/json',
                    },
                    body: JSON.stringify({
                        sessionId: sessionId,
                        message: message
                    })
                });

                if (!response.ok) {
                    throw new Error(`HTTP error! status: ${response.status}`);
                }

                const data = await response.json();
                
                // Show context info
                const contextInfo = data.historyMessageCount ? 
                    `Context messages: ${data.historyMessageCount}` : '';
                
                // Append the AI reply to the UI
                addMessage('assistant', data.answer || 'Reply received', contextInfo);

            } catch (error) {
                console.error('Error:', error);
                showError('Failed to send message: ' + error.message);
            } finally {
                isSending = false;
                sendButton.disabled = false;
                sendButton.textContent = 'Send';
            }
        }

        async function loadHistory() {
            const sessionId = getSessionId();
            
            try {
                const response = await fetch(`${API_BASE_URL}/${encodeURIComponent(sessionId)}/history`);
                
                if (!response.ok) {
                    throw new Error(`HTTP error! status: ${response.status}`);
                }

                const history = await response.json();
                
                // Clear the messages currently shown
                const chatMessages = document.getElementById('chatMessages');
                chatMessages.innerHTML = '';

                if (history.length === 0) {
                    chatMessages.innerHTML = `
                        <div class="empty-state">
                            <p>No history yet</p>
                        </div>
                    `;
                    return;
                }

                // Render the history
                history.forEach(msg => {
                    addMessage(msg.role, msg.content);
                });

            } catch (error) {
                console.error('Error loading history:', error);
                showError('Failed to load history: ' + error.message);
            }
        }

        async function clearHistory() {
            const sessionId = getSessionId();
            
            if (!confirm('Clear all history for this session?')) {
                return;
            }

            try {
                const response = await fetch(`${API_BASE_URL}/${encodeURIComponent(sessionId)}/history`, {
                    method: 'DELETE'
                });

                if (!response.ok) {
                    throw new Error(`HTTP error! status: ${response.status}`);
                }

                // Reset the UI
                const chatMessages = document.getElementById('chatMessages');
                chatMessages.innerHTML = `
                    <div class="empty-state">
                        <p>History cleared. Start a new conversation!</p>
                    </div>
                `;

            } catch (error) {
                console.error('Error clearing history:', error);
                showError('Failed to clear history: ' + error.message);
            }
        }

        // Load the default session's history when the page loads
        window.addEventListener('load', () => {
            loadHistory();
        });
    </script>
</body>
</html>
application.yml
server:
  port: 8080

ollama:
  base-url: http://localhost:11434
  default-model: llama3.2:3b       # model to use
  max-history-messages: 50         # context length: max history messages kept per session
# Models tried so far:
#llama3.2:1b
#qwen:0.5b
#llama3.2:3b
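
To make the `max-history-messages` setting more concrete, here is a minimal sketch of how a per-session history could be capped at that many messages. It is an illustration under my own assumptions, not the project's `OllamaChatService`: the class name `BoundedSessionHistory` and its methods are hypothetical, and it simply drops the oldest messages once the limit is exceeded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical illustration of a capped, per-session chat history (JDK 8 compatible). */
class BoundedSessionHistory {

    /** One chat turn; role is e.g. "user" or "assistant". */
    static final class Message {
        final String role;
        final String content;

        Message(String role, String content) {
            this.role = role;
            this.content = content;
        }
    }

    private final int maxMessages;
    private final Map<String, List<Message>> sessions = new ConcurrentHashMap<>();

    BoundedSessionHistory(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be >= 1");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message to the session and drops the oldest ones beyond the limit. */
    void append(String sessionId, String role, String content) {
        List<Message> history = sessions.computeIfAbsent(sessionId, id -> new ArrayList<>());
        synchronized (history) {
            history.add(new Message(role, content));
            while (history.size() > maxMessages) {
                history.remove(0);
            }
        }
    }

    /** Returns a copy of the session's history (empty if the session is unknown). */
    List<Message> snapshot(String sessionId) {
        List<Message> history = sessions.get(sessionId);
        if (history == null) {
            return new ArrayList<>();
        }
        synchronized (history) {
            return new ArrayList<>(history);
        }
    }

    /** Forgets everything stored for the session. */
    void clear(String sessionId) {
        sessions.remove(sessionId);
    }
}
```

With `new BoundedSessionHistory(50)`, each session would keep only its 50 most recent messages, which bounds how much context is sent to the model on every request. An in-memory map like this loses all history when the application restarts.
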
Readme.md
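# Ollama Chat Memory (Spring Boot edition)

A local AI chat application built on Spring Boot and Ollama, with **multi-session management**, **context memory**, and a **web chat UI**.

## ✨ Core features

- 🤖 **Local AI chat**: talks to a locally running Ollama service and works with any model it serves (e.g. qwen, llama3).
- 💾 **Context memory**: keeps the history of each session automatically, so the AI can recall earlier turns of the conversation.
- 👥 **Session isolation**: uses `sessionId` to separate conversations by user or by topic.
- 🌐 **Web chat UI**: ships with a built-in chat page, so no separate front-end setup is needed.
- 🔌 **RESTful API**: exposes standard endpoints that are easy to integrate into other systems.

---

## 🛠️ Tech stack

- **Language**: Java 1.8 (JDK 8)
- **Backend framework**: Spring Boot 2.7.18
- **HTTP client**: Spring RestTemplate
- **Dependency management**: Maven
- **AI engine**: Ollama (running locally)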

---

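## 🚀 Quick start

### 1. Prerequisites

Make sure [Ollama](https://ollama.com/) is installed and running, and that you have pulled a chat model (for example `qwen:0.5b`).

Start the Ollama service:

    ollama serve

In another terminal, run the model (the first run downloads it automatically):

    ollama run qwen:0.5b

### 2. Start the project

From the project root, build the jar and then run it:

    mvn clean package
    java -jar target/ollama-chat-memory-0.0.1-SNAPSHOT.jar

Alternatively, run the `OllamaChatMemoryApplication` main class directly from your IDE (IntelliJ IDEA).

### 3. Open the application

- **Web chat UI**: open [http://localhost:8080](http://localhost:8080) in a browser
- **API base URL**: `http://localhost:8080/api/v1/chat`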

---

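## 📡 API reference

### 1. Send a message (with context)

Sends a message to the given session; the AI replies based on that session's history.

- **URL**: `POST /api/v1/chat`
- **Content-Type**: `application/json`
- **Request body**:

      {
        "sessionId": "user-1001",
        "message": "Hello, please remember that I like Java",
        "model": "qwen:0.5b"
      }

  > Note: if `model` is omitted, the default model from the configuration file is used.

- **Response example**:

      {
        "sessionId": "user-1001",
        "model": "qwen:0.5b",
        "answer": "OK, I have noted that you like Java.",
        "historyMessageCount": 2
      }
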
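
As a rough illustration of calling this endpoint from plain Java 8 without extra libraries, the sketch below builds the request body by hand and posts it with `HttpURLConnection`. The class `ChatApiClient` and its methods are hypothetical names of mine and are not part of the project; the field names (`sessionId`, `message`, `model`) come from the request body documented above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical minimal client for POST /api/v1/chat (JDK 8 compatible). */
class ChatApiClient {

    /** Escapes characters that must not appear raw inside a JSON string. */
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    /** Builds the request body; pass model == null to let the server use its default model. */
    static String buildBody(String sessionId, String message, String model) {
        StringBuilder sb = new StringBuilder("{");
        sb.append("\"sessionId\":\"").append(jsonEscape(sessionId)).append("\",");
        sb.append("\"message\":\"").append(jsonEscape(message)).append("\"");
        if (model != null) {
            sb.append(",\"model\":\"").append(jsonEscape(model)).append("\"");
        }
        return sb.append("}").toString();
    }

    /** Posts the JSON body and returns the raw response text; throws on HTTP errors. */
    static String post(String url, String jsonBody) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json; charset=UTF-8");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(120000); // local models can take a while to answer
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(jsonBody.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        InputStream in = status < 400 ? conn.getInputStream() : conn.getErrorStream();
        String text = "";
        if (in != null) {
            try (InputStream body = in; ByteArrayOutputStream buf = new ByteArrayOutputStream()) {
                byte[] chunk = new byte[4096];
                int n;
                while ((n = body.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                text = new String(buf.toByteArray(), StandardCharsets.UTF_8);
            }
        }
        if (status >= 400) {
            throw new IOException("HTTP " + status + ": " + text);
        }
        return text;
    }
}
```

A call might look like `ChatApiClient.post("http://localhost:8080/api/v1/chat", ChatApiClient.buildBody("user-1001", "Hello", null))`. In a real project a JSON library such as Jackson (already pulled in by Spring Boot's web starter) would be a safer way to build and parse the bodies than string concatenation.
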
### 2. View session history

Returns all chat messages stored for the given session.

- **URL**: `GET /api/v1/chat/{sessionId}/history`
- **Example**: `GET /api/v1/chat/user-1001/history`

### 3. Clear session history

Deletes everything remembered for the given session.

- **URL**: `DELETE /api/v1/chat/{sessionId}/history`
- **Example**: `DELETE /api/v1/chat/user-1001/history`

---

## ⚙️ Configuration

Edit `src/main/resources/application.yml` to change the following settings:

    server:
      port: 8080
    ollama:
      base-url: http://localhost:11434   # Ollama service address
      default-model: qwen:0.5b           # model used by default
      max-history-messages: 20           # max history messages kept per session

---

## 📂 Project structure

D:\demo01
├─ pom.xml
├─ settings.xml
└─ src
   └─ main
      ├─ java
      │  └─ com
      │     └─ example
      │        └─ ollamachatmemory
      │           ├─ OllamaChatMemoryApplication.java        // Spring Boot entry point
      │           ├─ config
      │           │  └─ OllamaProperties.java                // Ollama configuration properties (baseUrl, defaultModel, etc.)
      │           ├─ controller
      │           │  ├─ ChatController.java                  // REST endpoints under /api/v1/chat
      │           │  └─ HomeController.java                  // forwards the root path "/" to index.html
      │           ├─ dto
      │           │  ├─ ChatRequest.java                     // chat request body (sessionId, message, model)
      │           │  └─ ChatResponse.java                    // chat response body (sessionId, model, answer, historyMessageCount)
      │           ├─ model
      │           │  ├─ ChatMessage.java                     // a single message (role, content)
      │           │  ├─ OllamaChatRequest.java               // request body sent to Ollama (model, messages, stream)
      │           │  └─ OllamaChatResponse.java              // response body returned by Ollama (model, message)
      │           └─ service
      │              └─ OllamaChatService.java               // core chat service: session history management + calls to Ollama
      └─ resources
         ├─ application.yml                                  // application config (port, Ollama address, maxHistoryMessages, etc.)
         └─ static
            └─ index.html                                    // front-end chat page (displays the remembered context)


---

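## 💡 FAQ

1. **Error "Port 8080 is already in use"**
   - Cause: another program is using port 8080.
   - Fix: change `server.port` in `application.yml`, or stop the process that occupies the port.

2. **The AI replies slowly**
   - Cause: the local model is large or the hardware is limited.
   - Suggestion: try a smaller model (such as `qwen:0.5b` or `tinyllama`).

3. **Cannot connect to Ollama**
   - Check that Ollama is running (`ollama serve`).
   - Check that `ollama.base-url` in `application.yml` is correct.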
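
For the "cannot connect" case, a quick way to tell whether the problem is on the Ollama side or in the Spring Boot app is to probe the Ollama base URL directly. The sketch below is a hypothetical helper of mine (`OllamaHealthCheck` is not part of the project); it relies on the Ollama server answering a plain `GET` on its root URL with HTTP 200 when it is up.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

/** Hypothetical helper: checks whether an Ollama server answers at the given base URL. */
class OllamaHealthCheck {

    /** Returns true only if a GET on baseUrl completes with HTTP 200 within the timeout. */
    static boolean isReachable(String baseUrl, int timeoutMillis) {
        try {
            HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl).openConnection();
            conn.setRequestMethod("GET");
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            int status = conn.getResponseCode();
            conn.disconnect();
            return status == 200;
        } catch (IOException | ClassCastException e) {
            // Malformed URL, connection refused, timeout, or a non-HTTP URL
            return false;
        }
    }

    public static void main(String[] args) {
        // Default matches ollama.base-url in application.yml
        String baseUrl = args.length > 0 ? args[0] : "http://localhost:11434";
        System.out.println(baseUrl + " reachable: " + isReachable(baseUrl, 2000));
    }
}
```

If this reports `false` while the tray icon is visible, the base URL or port in `application.yml` is the first thing to double-check.
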

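Testing and deployment

  • Configure Maven and build the package

My system-wide environment variables point to a different JDK, so I have to set the JDK for this project explicitly in the current PowerShell session:

 $env:JAVA_HOME = "C:\Program Files\Java\jdk1.8.0_152"; $env:Path = "$env:JAVA_HOME\bin;$env:Path"; 

If your environment is already correct, run the Maven build command directly:

mvn clean package -DskipTests -s D:\demo01\settings.xml

  • Run the OllamaChatMemoryApplication class to start the project

Once it has started, open the Ollama chat page at http://localhost:8080/

Possible extensions

  • Add RAG to back the chat with a knowledge base
  • Fine-tune a model for a vertical domain
  • Add voice input and output (JavaFX or Android integration)

Closing notes

  • Limitations of local models (for example, compute requirements)
  • Recommended follow-up resources (Java ML library docs, model hub links)
  • For more background on models, see the Introduction page on the LangChain Chinese site (LangChain中文网)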
